add unit test coverage for record-storage cluster (Empty / VFS / Sequential) #5446

@aglinxinyuan

Background

Three classes in engine/common/storage currently lack a dedicated unit spec:

| Source class | Package | Purpose |
| --- | --- | --- |
| SequentialRecordStorage | org.apache.texera.amber.engine.common.storage | Abstract sequential-record reader/writer + getStorage factory |
| VFSRecordStorage | (same) | Apache Commons VFS concrete implementation |
| EmptyRecordStorage | (same) | Null-object implementation (no-op writer / EOF reader / always-false containsFolder) |

All three are reachable from production code (SequentialRecordStorage.getStorage is the factory used by checkpoint logging) but none have characterization tests. A regression in any of these would only surface as a downstream serde / replay failure.
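
To make the factory and null-object shape described above concrete, here is a minimal sketch using only the Scala and Java standard libraries. The names `Storage`, `EmptyStorage`, `BackedStorage`, and `StorageFactory` are hypothetical stand-ins, not the Texera classes, and the in-memory backend is an assumption made purely so the example runs on its own.

```scala
import java.net.URI
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a record storage; NOT the Texera API.
trait Storage {
  def write(record: String): Unit
  def readAll(): Iterator[String]
}

// Null-object variant: every call is safe and has no observable effect.
object EmptyStorage extends Storage {
  def write(record: String): Unit = ()              // record is silently discarded
  def readAll(): Iterator[String] = Iterator.empty  // reader yields zero records
}

// In-memory placeholder for a real backend, tagged with the URI scheme that selected it.
final class BackedStorage(val scheme: String) extends Storage {
  private val records = ArrayBuffer.empty[String]
  def write(record: String): Unit = { records += record; () }
  def readAll(): Iterator[String] = records.iterator
}

object StorageFactory {
  // No location selects the null object; otherwise the URI scheme picks the backend.
  def getStorage(location: Option[URI]): Storage = location match {
    case None      => EmptyStorage
    case Some(uri) => new BackedStorage(Option(uri.getScheme).getOrElse("file"))
  }
}
```

A characterization test for this shape asserts on the returned type for each input and checks that the null object accepts writes and yields nothing on read.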

What we want pinned

Behavior we want to lock in:

| Area | Contract |
| --- | --- |
| SequentialRecordStorage.getStorage(None) | returns an EmptyRecordStorage |
| SequentialRecordStorage.getStorage(Some(file://…)) | returns a VFSRecordStorage |
| SequentialRecordStorage.getStorage(Some(hdfs://…)) | dispatches to HDFSRecordStorage (path covered without actually opening an HDFS connection by asserting the constructor blows up on a non-resolvable host rather than silently returning VFSRecordStorage) |
| SequentialRecordWriter / SequentialRecordReader | round-trip a sequence of records through AmberRuntime.serde (size-prefixed framing) |
| SequentialRecordStorage.fetchAllRecords | iterates all records returned by the underlying reader |
| VFSRecordStorage constructor | auto-creates the target folder when it does not exist |
| VFSRecordStorage.getWriter / getReader | round-trip a record through a local file:// URI |
| VFSRecordStorage.deleteStorage | removes the on-disk folder created by the constructor |
| VFSRecordStorage.containsFolder | distinguishes existing folder vs. existing file vs. missing entry |
| EmptyRecordStorage.getWriter | returns a writer backed by NullOutputStream (writes are silently discarded) |
| EmptyRecordStorage.getReader | returns a reader that yields zero records |
| EmptyRecordStorage.deleteStorage / containsFolder | are no-op and always-false respectively |
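
For reference, the "size-prefixed framing" round trip named in the contracts above can be sketched as follows. This is an illustration under stated assumptions, not the production format: the 4-byte big-endian length prefix, the raw byte-array payloads, and the `FramingSketch` object are all hypothetical, and the real writer and reader go through AmberRuntime.serde, which this issue does not describe in detail.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical helper; the prefix width and byte order are assumptions.
object FramingSketch {
  // Write each record as a 4-byte big-endian length followed by the payload bytes.
  def writeRecords(records: Seq[Array[Byte]]): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    records.foreach { bytes =>
      out.writeInt(bytes.length)
      out.write(bytes)
    }
    out.flush()
    buffer.toByteArray
  }

  // Read frames back in order until no bytes remain.
  // available() is exact for ByteArrayInputStream; a file-backed reader
  // would need a different end-of-stream check.
  def readRecords(data: Array[Byte]): List[Array[Byte]] = {
    val in = new DataInputStream(new ByteArrayInputStream(data))
    val result = List.newBuilder[Array[Byte]]
    while (in.available() > 0) {
      val length = in.readInt()
      val payload = new Array[Byte](length)
      in.readFully(payload) // throws EOFException if the frame is truncated
      result += payload
    }
    result.result()
  }
}
```

A round-trip test in this style writes a short sequence that includes an empty record, reads it back, and compares both the order and the contents; reading from empty input should yield zero records, which mirrors the EmptyRecordStorage.getReader contract.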

Scope

  • New spec files (one per source class per the spec-filename convention):
    • SequentialRecordStorageSpec.scala
    • VFSRecordStorageSpec.scala
    • EmptyRecordStorageSpec.scala
  • No production-code changes.
  • Tests use the production wire path (AmberRuntime.serde) the same way CheckpointSubsystemSpec / ClientEventSpec do (a suite-local ActorSystem injected into AmberRuntime via reflection, torn down in afterAll).
