Skip to content

perf: optimize serialization tests for HTTP-backed services#2525

Open
BrendanWalsh wants to merge 1 commit intomasterfrom
brwals/optimize-serialization-tests
Open

perf: optimize serialization tests for HTTP-backed services#2525
BrendanWalsh wants to merge 1 commit intomasterfrom
brwals/optimize-serialization-tests

Conversation

@BrendanWalsh
Copy link
Copy Markdown
Collaborator

Summary

Reduces redundant API calls in serialization round-trip tests by comparing DataFrame schemas instead of full data for HTTP-backed transformers. This cuts API usage by ~56% across 61 test suites, significantly reducing 429 rate-limit pressure.

Changes

  • Fuzzing.scala: Add compareDataInSerializationTest flag (default true) to TestObject. When false, serialization tests compare only schemas instead of executing full API round-trips.
  • 61 test suites (*Suite.scala): Override compareDataInSerializationTest = false for HTTP-backed cognitive service transformers where the serialization round-trip test's value is in verifying serialize/deserialize correctness, not re-calling the API.

Motivation

Every serialization test was calling the backing HTTP API twice (once for original, once for deserialized pipeline). For services like OpenAI, Bing, Translator, etc., this doubles API costs and rate-limit exposure without testing anything new — the serialization correctness is verified by schema comparison.

Testing

  • Serialization tests still execute and verify round-trip correctness (schema comparison)
  • Transformers without the override are unaffected (full data comparison preserved)
  • Validated in container CI builds (all 61 affected suites pass)

Split from #2506 for independent review.

Copilot AI review requested due to automatic review settings March 27, 2026 05:27
@github-actions
Copy link
Copy Markdown

Hey @BrendanWalsh 👋!
Thank you so much for contributing to our repository 🙌.
Someone from SynapseML Team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

To test your commit locally, please follow our guild on building from source.
Check out the developer guide for additional guidance on testing your change.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 5bd69b6.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces redundant external HTTP/API calls in serialization round-trip tests for HTTP-backed cognitive service stages by allowing suites to switch from full DataFrame equality checks to schema-only verification.

Changes:

  • Added a compareDataInSerializationTest switch in serialization fuzzing to optionally avoid full data collection/comparison.
  • Updated many HTTP-backed cognitive service suites to disable data comparison during serialization tests.
  • Removed several suite-specific assertDFEq overrides that were only needed for serialization data comparisons.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
core/src/test/scala/com/microsoft/azure/synapse/ml/core/test/fuzzing/Fuzzing.scala Adds compareDataInSerializationTest and changes serialization comparison behavior.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/vision/ComputerVisionSuite.scala Disables serialization data comparison for multiple Vision HTTP-backed suites; removes custom assertDFEq overrides.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/translate/TranslatorSuite.scala Disables serialization data comparison for Translator suites; removes data-comparison customization code.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/text/TextAnalyticsSuite.scala Disables serialization data comparison for Text Analytics fuzzing base.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/speech/TextToSpeechSuite.scala Disables serialization data comparison for Text-to-Speech suite.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/speech/SpeechToTextSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/speech/SpeechToTextSDKSuite.scala Disables serialization data comparison for Speech SDK suites.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/speech/SpeakerEmotionInferenceSuite.scala Disables serialization data comparison for Speaker Emotion Inference.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/openai/OpenAIResponsesSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/openai/OpenAIPromptSuite.scala Disables serialization data comparison for OpenAI Prompt.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/openai/OpenAIEmbeddingsSuite.scala Disables serialization data comparison for OpenAI Embeddings.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/openai/OpenAICompletionSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/openai/OpenAIChatCompletionSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/language/AnalyzeTextSuite.scala Disables serialization data comparison for multiple Language service suites.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/language/AnalyzeTextLROSuite.scala Disables serialization data comparison for LRO Language service suites.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/geospatial/AzureMapsSuite.scala Disables serialization data comparison; removes custom assertDFEq overrides.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/form/FormRecognizerV3Suite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/form/FormRecognizerSuite.scala Disables serialization data comparison across multiple Form Recognizer suites.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/form/FormOntologyLearnerSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/face/FaceSuite.scala Disables serialization data comparison for Face API suites; removes custom assertDFEq override.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/anomaly/AnamolyDetectionSuite.scala Disables serialization data comparison for Anomaly Detection suites.
cognitive/src/test/scala/com/microsoft/azure/synapse/ml/services/aifoundry/AIFoundryChatCompletionSuite.scala Disables serialization data comparison; removes custom assertDFEq override.
Comments suppressed due to low confidence (3)

core/src/test/scala/com/microsoft/azure/synapse/ml/core/test/fuzzing/Fuzzing.scala:486

  • The PR description says the flag was added to TestObject, but the implementation adds it as a val on SerializationFuzzing (per-suite), so it can’t be configured per test object. Either update the PR description to match the code, or move the flag onto TestObject if per-object behavior is needed (e.g., mixed HTTP + non-HTTP stages in one suite).
  val compareDataInSerializationTest: Boolean = true

  private def testSerializationHelper(path: String,

core/src/test/scala/com/microsoft/azure/synapse/ml/core/test/fuzzing/Fuzzing.scala:500

  • Wrapping fit(...).transform(...) in Future/Await here likely doesn’t provide the intended parallelism for the expensive work: assertDFEq ultimately does sequential collect() calls (see DataFrameEquality). Meanwhile, running the two fit operations concurrently can increase external API concurrency and test flakiness, and Duration.Inf can hang CI indefinitely. Prefer sequential execution here, or use a bounded timeout and only parallelize the actual actions intentionally (with care around Spark concurrency).
          if (compareDataInSerializationTest) {
            val f1 = Future(e1.fit(fitDF).transform(transDF))
            val f2 = Future(e2.fit(fitDF).transform(transDF))
            assertDFEq(Await.result(f1, Duration.Inf), Await.result(f2, Duration.Inf))
          } else {

core/src/test/scala/com/microsoft/azure/synapse/ml/core/test/fuzzing/Fuzzing.scala:513

  • Same concern for the transformer branch: Future(t.transform(...)) only constructs a lazy DataFrame and won’t execute the HTTP-backed work until an action, so this doesn’t reduce wall-clock time but does add concurrency/complexity. Also, Duration.Inf risks indefinite hangs if a future never completes. Consider removing the futures or using a bounded timeout and only parallelizing the specific actions you want to overlap.
          if (compareDataInSerializationTest) {
            val f1 = Future(t1.transform(transDF))
            val f2 = Future(t2.transform(transDF))
            assertDFEq(Await.result(f1, Duration.Inf), Await.result(f2, Duration.Inf))
          } else {

Comment on lines +482 to +483
// When true, serialization tests compare schemas instead of collecting data.
// Set to true for HTTP service suites where transform() triggers external API calls.
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new compareDataInSerializationTest doc comment is inverted relative to the implementation: when the flag is true the code still calls assertDFEq (collects and compares full data), and when it is false the transformer path compares only transformSchema. Update the comment (and/or rename the flag) so the documented behavior matches the actual logic, especially since the suites in this PR override it to false to avoid HTTP calls.

This issue also appears in the following locations of the same file:

  • line 484
  • line 496
  • line 509
Suggested change
// When true, serialization tests compare schemas instead of collecting data.
// Set to true for HTTP service suites where transform() triggers external API calls.
// When true, serialization tests compare full transformed DataFrames via assertDFEq.
// When false, estimators are only executed (fit/transform) and transformers compare only transformSchema.
// Set to false for HTTP service suites where transform() may trigger external API calls or be non-deterministic.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Doc comment corrected to match implementation: true = full data comparison, false = schema-only comparison.

Skip redundant API calls in serialization round-trip tests for 61 HTTP
service suites across 22 test files. These tests verify save/load/fit/
transform work correctly but don't need to compare output data (which
requires live API calls that are slow and non-deterministic).

- Fuzzing.scala: Transformer path uses transformSchema() instead of
  transform() when compareDataInSerializationTest=false. Estimator path
  verifies save/load/fit/transform succeed without comparing schemas
  (API-backed estimators produce non-deterministic field ordering).
- Remove 14 dead assertDFEq overrides that were unreachable once the
  flag is false, along with unused imports and helper methods.
@BrendanWalsh BrendanWalsh force-pushed the brwals/optimize-serialization-tests branch from 13f8ca1 to 5bd69b6 Compare March 27, 2026 19:08
@BrendanWalsh
Copy link
Copy Markdown
Collaborator Author

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.96%. Comparing base (895752c) to head (5bd69b6).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2525      +/-   ##
==========================================
- Coverage   84.61%   83.96%   -0.66%     
==========================================
  Files         335      335              
  Lines       17708    17708              
  Branches     1612     1612              
==========================================
- Hits        14984    14868     -116     
- Misses       2724     2840     +116     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants