Skip to content

[protocol] Sanitize em dashes and add store-level storageMode in v44 schema#2822

Merged
ymuppala merged 2 commits into
linkedin:mainfrom
sixpluszero:jlliu/fix-em-dash-in-v44-v99-schemas
May 23, 2026
Merged

[protocol] Sanitize em dashes and add store-level storageMode in v44 schema#2822
ymuppala merged 2 commits into
linkedin:mainfrom
sixpluszero:jlliu/fix-em-dash-in-v44-v99-schemas

Conversation

@sixpluszero
Copy link
Copy Markdown
Contributor

@sixpluszero sixpluszero commented May 23, 2026

Problem Statement

Two issues with the StoreMetaValue v44 / AdminOperation v99 schemas added by #2806 / #2814:

Issue 1: em-dash characters break v44 / v99 activation

The doc fields use Unicode em dashes (, U+2014). The schemas survive an in-process parse->toString round-trip but not the HTTP path used by ControllerClientBackedSystemSchemaInitializer — em dashes arrive at the controller as ?. When the controller's duplicate-detection logic in HelixReadWriteSchemaRepository.getNextAvailableSchemaId then compares the just-registered v44 (with em dashes) against the new POST body (with ?), Schema.equals returns true but AvroSchemaUtils.hasDocFieldChange returns true. The POST is treated as a new schema, returns next available id is 45 instead of DUPLICATE_VALUE_SCHEMA_CODE, and the controller throws:

VeniceException: Inconsistent value schema id between the caller and the local schema repository.
Expected new schema id of 44 but the next available id from the local repository is 45 ...

The client retries this for ~40s before it eventually re-reads the ZK snapshot. This pushed VeniceClientCompatibilityTest past its 60s cluster-ready budget, manifesting as Connection refused to the router on 5 consecutive CI runs of #2817.

v1..v43 use only ASCII in their doc strings, so the HTTP round-trip is lossless. The em dashes appear for the first time on v44.

Issue 2: store-level storageMode was missing from StoreProperties

The v99 UpdateStore admin op already carries storageMode as an operator-override, but its doc said the controller only copies the value into StoreVersion.storageMode for new versions. There was no persistent store-level slot for storageMode on StoreProperties. That meant the store-level default could not be remembered between version creations; the operator would need to repeat the admin op on every new version. StoreProperties did have a externalStorageReadMode field but not a storageMode field, which is inconsistent with how the two external-storage knobs are intended to work together at the store level.

Solution

  1. Replace the four em dashes in v44 / v99 doc strings with ASCII double hyphens (--). Two files, four lines.
  2. Add storageMode (int, default 0) to StoreProperties in v44, alongside the existing externalStorageReadMode. Same encoding as StoreVersion.storageMode and UpdateStore.storageMode (0 = INTERNAL, 1 = DUAL_WRITE, 2 = EXTERNAL). Update UpdateStore.storageMode doc in v99 to call out that the controller now also persists the value on the store record.

The underlying charset bug (SchemaEntry.getSchemaBytes() using platform default charset, HTTP form-param encoding not pinned to UTF-8) is intentionally left for a follow-up PR; restricting doc strings to ASCII is the minimal change that unblocks v44 / v99 activation.

Code changes

  • Added new code behind a config. (no)
  • Introduced new log lines. (no)

Concurrency-Specific Checks

  • No race conditions or thread safety issues. (schema-only change)
  • No new synchronization mechanisms required.
  • No blocking calls introduced.
  • No thread-safe collections needed.
  • No multi-threaded code added.

How was this PR tested?

  • Reproduced the failure locally on PR [venice-common][controller][compact] Activate StoreMetaValue v44 / AdminOperation v99 + Java accessors #2817 (which activates v44 / v99 by removing versionOverrides):
    • schema_id=44 POST attempts: 505, Inconsistent value schema id errors: 405, setUp FAILED (Connection refused to router after 60s) for every Avro client version (1.4 .. 1.10). Total: build fails after ~5 min.
  • Applied this PR's em-dash fix on top of [venice-common][controller][compact] Activate StoreMetaValue v44 / AdminOperation v99 + Java accessors #2817 and re-ran :internal:venice-avro-compatibility-test:testAvro1_10: schema_id=44 POSTs 4 (normal), Inconsistent errors 0, all 6 tests PASSED, build completed in 40 s.
  • Re-ran the same test after the StoreProperties.storageMode addition: Inconsistent errors 0, all 6 tests PASSED, build completed in 54 s.
  • Verified backward compatibility: the em-dash fix touches only doc text (no impact on serialization). The new storageMode field on StoreProperties has default: 0, so v43 readers ignore it and writers that never set it produce records that v43 readers decode unchanged.

Does this PR introduce any user-facing or breaking changes?

  • No. Avro doc is metadata-only. The new storageMode field on StoreProperties defaults to 0 (INTERNAL, i.e. current Venice-only behavior), so stores that do not opt in see no change.

The StoreMetaValue v44 / AdminOperation v99 doc strings use Unicode em
dashes (U+2014). The schemas survive a parse->toString round-trip, but
not the HTTP path used by ControllerClientBackedSystemSchemaInitializer:
the bytes arrive at the controller as '?', so the in-process and HTTP
registrations of v44 disagree on doc text. The controller's duplicate
check treats this as a new schema, returns "next available id is 45",
and the client retries the POST for ~40s -- enough to push cluster
bring-up past the 60s setUp timeout in VeniceClientCompatibilityTest
and consistently fail v44 / v99 activation.

Restricting the new doc strings to ASCII unblocks activation without
touching the underlying charset bug in SchemaEntry.getSchemaBytes()
and the HTTP form-param encoding, which can be fixed independently.
Copilot AI review requested due to automatic review settings May 23, 2026 08:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes Unicode em dashes (U+2014) from Avro schema doc strings in the staged protocol versions StoreMetaValue v44 and AdminOperation v99, replacing them with ASCII -- to avoid schema-registration mismatches caused by a lossy HTTP encoding path during controller/server bootstrap.

Changes:

  • Replaced em dashes with ASCII -- in AdminOperation v99 doc strings (2 occurrences).
  • Replaced em dashes with ASCII -- in StoreMetaValue v44 doc strings (2 occurrences).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
services/venice-controller/src/main/resources/avro/AdminOperation/v99/AdminOperation.avsc Replaces non-ASCII em dashes in doc strings with ASCII to keep schema text identical across registration paths.
internal/venice-common/src/main/resources/avro/StoreMetaValue/v44/StoreMetaValue.avsc Replaces non-ASCII em dashes in doc strings with ASCII to prevent duplicate-detection mismatches when activating v44.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Mirrors the existing UpdateStore.storageMode admin-op field with a
persisted counterpart on the store record. UpdateStore.storageMode
previously had nowhere to land at the store level; the controller could
only forward the value into each newly-created StoreVersion.storageMode.
StoreProperties.storageMode now lets the controller remember the
store-level default so future versions inherit it without the operator
having to repeat the admin op.

Same int encoding as UpdateStore / StoreVersion (0=INTERNAL,
1=DUAL_WRITE, 2=EXTERNAL) with default 0 for forward compatibility with
v43 readers. Doc on UpdateStore.storageMode updated to call out the new
persistence path.
@sixpluszero sixpluszero changed the title [protocol] Replace em dashes with ASCII in v44/v99 schema docs [protocol] Sanitize em dashes and add store-level storageMode in v44 schema May 23, 2026
Copy link
Copy Markdown
Collaborator

@ymuppala ymuppala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@ymuppala ymuppala merged commit 9a111ba into linkedin:main May 23, 2026
317 of 329 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants