Moving Data Between Backends
What this page covers: copying entities from one Storage to another with StorageTransfer —
the full builder surface, the three ErrorPolicy modes, what the TransferReport tells you (and why
its future never fails for expected errors), the two-argument descriptor(...) for renaming a
collection or changing codec mid-flight, and a maintenance-window cutover playbook.
📌 Note — the transfer only reads the source. It copies, never moves; deleting the source is a separate, explicit action you take afterwards. Re-running is safe (every backend upserts).
import br.com.finalcraft.everydatabase.transfer.*;
TransferReport report = StorageTransfer.builder()
.from(oldLocalFileStorage) // read-only source
.to(newSqlStorage) // write target
.descriptor(PLAYERS) // one or more collections to copy
.descriptor(ACCOUNTS)
.build()
.execute() // CompletableFuture<TransferReport>
.join();
if (report.success()) {
System.out.printf("Done: %d entities in %dms%n", report.totalEntities(), report.durationMs());
} else {
report.errors().forEach(e -> System.err.printf("[%s] %s%n", e.collection(), e.cause().getMessage()));
}
That's it: source, target, the descriptors you want copied, execute(). Everything else has a sane
default. execute() returns a CompletableFuture<TransferReport> like every I/O call in the library —
see The Async API; we .join() here for brevity.
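When blocking is not an option, the same future can be composed instead of joined. The following is a minimal sketch using only java.util.concurrent; FakeReport and fakeExecute() are hypothetical stand-ins for TransferReport and transfer.execute(), so nothing here is the library's own API:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a report; NOT the library's TransferReport.
final class FakeReport {
    final boolean success;
    final long totalEntities;

    FakeReport(boolean success, long totalEntities) {
        this.success = success;
        this.totalEntities = totalEntities;
    }
}

final class AsyncCompositionSketch {
    // Stand-in for transfer.execute(): the report is produced on another thread.
    static CompletableFuture<FakeReport> fakeExecute() {
        return CompletableFuture.supplyAsync(() -> new FakeReport(true, 42));
    }

    // Compose instead of blocking: map the report to a one-line summary.
    static CompletableFuture<String> summarize() {
        return fakeExecute().thenApply(r ->
                r.success ? "ok: " + r.totalEntities + " entities" : "failed");
    }
}
```

The caller decides where (or whether) to block; a plugin or server would typically hand the summary to a logger inside a further thenAccept rather than calling join().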
💡 Tip — the same EntityDescriptor you already use for CRUD is what you register here. The source repository and target repository are derived from it, so the entity type, key extractor and codec are guaranteed to match on both sides.
StorageTransfer lives in everydatabase-core — no extra dependency. Coordinates on Installation.
StorageTransfer.builder() returns a fluent Builder. Mandatory: from, to, and at least one
descriptor. Everything else has a default.
| Method | Default | What it does |
|---|---|---|
| from(Storage source) | — (required) | The storage to read from. Never modified. |
| to(Storage target) | — (required) | The storage to write into. |
| descriptor(EntityDescriptor<K,V>) | — (≥1 required) | A collection to copy (same descriptor both sides). |
| descriptor(EntityDescriptor<K,V> src, EntityDescriptor<K,V> dst) | — | Copy with a rename / codec change (see below). |
| batchSize(int) | 500 | Entities per saveAll on the target. Higher = fewer round-trips, more memory. Must be >= 1. |
| errorPolicy(ErrorPolicy) | FAIL_FAST | How write failures are handled (see ErrorPolicy). |
| applyTargetMigrations(boolean) | true | If the target is SchemaAwareStorage, run migrate() during pre-flight. |
| failIfTargetCollectionNotEmpty(boolean) | true | Abort a collection if the target already has data (count() > 0). |
| verifyCounts(boolean) | true | After each collection, assert entitiesWritten == sourceCount; fail the report if they diverge. |
| progressListener(Consumer<TransferProgress>) | null | Called after every batch with a progress snapshot. |
| build() | — | Validates and returns the StorageTransfer. Missing source/target/descriptor throws IllegalStateException. |
A fully-specified transfer:
StorageTransfer transfer = StorageTransfer.builder()
.from(oldLocalFileStorage)
.to(newSqlStorage)
.descriptor(PLAYERS)
.descriptor(ACCOUNTS)
.batchSize(1000)
.applyTargetMigrations(true) // run the target's migrations first
.failIfTargetCollectionNotEmpty(true) // refuse to overwrite existing target data
.verifyCounts(true) // assert written == source count
.errorPolicy(ErrorPolicy.FAIL_FAST)
.progressListener(p ->
System.out.printf("%s: %d/%d (%dms)%n", p.collection(), p.done(), p.total(), p.elapsedMs()))
.build();
TransferReport report = transfer.execute().join();
⚠️ Gotcha — applyTargetMigrations, failIfTargetCollectionNotEmpty and verifyCounts all default to true (the safe choices). You opt out of the safety rails, you don't opt in. Set failIfTargetCollectionNotEmpty(false) only when you deliberately want to merge into a populated target — and pair it with ErrorPolicy.SKIP_EXISTING so you don't clobber what's there.
progressListener receives a TransferProgress after each batch: collection(), done() (entities
written so far in this collection), total() (the source count() snapshot taken at start), and
elapsedMs(). Within a collection, done only grows and total is fixed.
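To make the batchSize and progress semantics concrete, here is a rough stand-alone model of a batched copy loop. ProgressSnapshot and BatchedCopySketch are invented for illustration (they are not the library's TransferProgress or its internals), and the exception thrown for a bad batch size is this sketch's own choice. The point is only that total is snapshotted once and done climbs one batch at a time:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a progress snapshot; NOT the library's TransferProgress.
final class ProgressSnapshot {
    final String collection;
    final long done;
    final long total;

    ProgressSnapshot(String collection, long done, long total) {
        this.collection = collection;
        this.done = done;
        this.total = total;
    }
}

final class BatchedCopySketch {
    // Copies source into target in chunks of batchSize, calling listener after each chunk.
    // Returns the number of batches written (one simulated saveAll round-trip each).
    static <V> int copy(String collection, List<V> source, List<V> target,
                        int batchSize, Consumer<ProgressSnapshot> listener) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long total = source.size();                     // snapshot taken once, up front
        int batches = 0;
        for (int from = 0; from < source.size(); from += batchSize) {
            int to = Math.min(from + batchSize, source.size());
            target.addAll(source.subList(from, to));    // stands in for saveAll(batch)
            batches++;
            if (listener != null) {
                listener.accept(new ProgressSnapshot(collection, to, total));
            }
        }
        return batches;
    }
}
```

With 1201 entities and a batch size of 500, this loop performs three writes and the listener sees done values of 500, 1000 and 1201 against a constant total of 1201.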
import br.com.finalcraft.everydatabase.transfer.ErrorPolicy;
| Policy | On a write failure | Target safety | Speed |
|---|---|---|---|
| FAIL_FAST (default) | first exception aborts the whole transfer | unstarted collections untouched | batched |
| CONTINUE | record the error, keep going with remaining batches/collections | best-effort; partial writes possible | batched |
| SKIP_EXISTING | write entity-by-entity, exists()-check each key, skip ones already present | never overwrites existing target data | slower (2 round-trips/entity) |
- FAIL_FAST is the safest default: it stops the moment anything goes wrong, so a partial, inconsistent target can't silently form. On failure, the report carries exactly one TransferError.
- CONTINUE is "best effort, inspect afterwards." All failures land in report.errors(); report.success() is false if any error was recorded.
- SKIP_EXISTING is the non-destructive merge mode. It abandons the batch path and writes one entity at a time, calling Repository.exists(key) first and skipping keys already on the target. Use it with failIfTargetCollectionNotEmpty(false) to fold new data into a populated collection without touching what's there.
// Merge new players into an already-populated target, preserving existing rows:
StorageTransfer.builder()
.from(src).to(dst)
.descriptor(PLAYERS)
.failIfTargetCollectionNotEmpty(false) // the target already has data — allow it
.errorPolicy(ErrorPolicy.SKIP_EXISTING) // ...but only add keys that aren't there yet
.build()
.execute()
.join();
📌 Note — under SKIP_EXISTING, verifyCounts relaxes its check from entitiesWritten == sourceCount to entitiesWritten <= sourceCount, because skipped entities are expected and should not flag the report as failed.
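The SKIP_EXISTING behaviour and the relaxed count check can be modelled with two plain maps. This is a sketch of the semantics described above, not the library's implementation: containsKey stands in for Repository.exists(key), put stands in for a single-entity save, and SkipExistingSketch is a name made up for this example.

```java
import java.util.Map;

final class SkipExistingSketch {
    // Writes only the keys the target does not already hold; returns how many were written.
    static <K, V> int mergeSkippingExisting(Map<K, V> source, Map<K, V> target) {
        int written = 0;
        for (Map.Entry<K, V> e : source.entrySet()) {
            if (target.containsKey(e.getKey())) {   // stands in for exists(key)
                continue;                           // already present: leave it untouched
            }
            target.put(e.getKey(), e.getValue());   // stands in for a single-entity save
            written++;
        }
        return written;
    }

    // Count verification: exact match normally, "at most" when skipping is expected.
    static boolean countsOk(long entitiesWritten, long sourceCount, boolean skipExisting) {
        return skipExisting ? entitiesWritten <= sourceCount
                            : entitiesWritten == sourceCount;
    }
}
```

Merging three source keys into a target that already holds one of them writes two entities and keeps the existing value; the strict check would reject 2 of 3, while the relaxed one accepts it.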
This is the contract to internalize: execute()'s future completes normally even when the transfer
itself failed. Expected failures (a write blew up, a count mismatched, the target collection wasn't
empty) are data, collected inside the report — not exceptions thrown out of the future. An exception
escaping the future means an unexpected JVM-level failure, not a transfer error.
So you always check report.success():
TransferReport report = transfer.execute().join(); // does NOT throw for a failed transfer
if (report.success()) {
System.out.printf("Transferred %d entities across %d collections in %dms%n",
report.totalEntities(), report.collections().size(), report.durationMs());
for (CollectionStats s : report.collections().values()) {
System.out.printf(" %s -> %s: %d/%d written, target %d -> %d, %dms%n",
s.sourceCollection(), s.targetCollection(),
s.entitiesWritten(), s.sourceCount(),
s.targetCountBefore(), s.targetCountAfter(), s.durationMs());
}
} else {
for (TransferError e : report.errors()) {
System.err.printf("[%s] key=%s: %s%n", e.collection(), e.key(), e.cause().getMessage());
}
}
What the report exposes:
| Member | Meaning |
|---|---|
| success() | true only if no error was recorded and all count verifications passed. |
| totalEntities() | Sum of entitiesWritten() across every collection. |
| durationMs() | Wall-clock from the first pre-flight check to the last verification. |
| collections() | Map<String, CollectionStats> keyed by source collection name, in registration order. |
| errors() | List<TransferError>; empty on success. With FAIL_FAST, at most one entry. |
Each CollectionStats carries sourceCollection(), targetCollection(), sourceCount(),
targetCountBefore(), targetCountAfter(), entitiesWritten(), and durationMs() — enough to verify
a clean copy: targetCountAfter() - targetCountBefore() should equal sourceCount() (or be smaller
under SKIP_EXISTING).
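As a sanity check on those numbers, a helper along these lines could compare the counts. CountsSketch is a hypothetical holder written for this example, not the library's CollectionStats; in real code the three values would come from the corresponding accessors.

```java
// Hypothetical holder for the three counts; NOT the library's CollectionStats.
final class CountsSketch {
    final long sourceCount;
    final long targetCountBefore;
    final long targetCountAfter;

    CountsSketch(long sourceCount, long targetCountBefore, long targetCountAfter) {
        this.sourceCount = sourceCount;
        this.targetCountBefore = targetCountBefore;
        this.targetCountAfter = targetCountAfter;
    }

    // How many rows the target gained during the copy.
    long added() {
        return targetCountAfter - targetCountBefore;
    }

    // Exact match for a normal copy; "at most the source count" when skipping existing keys.
    boolean looksClean(boolean skipExisting) {
        return skipExisting ? added() <= sourceCount : added() == sourceCount;
    }
}
```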
A TransferError carries collection(), key() (the entity key, or null for a global/collection-level
failure like a pre-flight abort or count mismatch), and cause() (the Throwable).
⚠️ Gotcha — don't write try { transfer.execute().join(); } catch (...) { /* handle failure */ } and expect to catch a failed transfer there. The future succeeds; the failure is in the report. A thrown exception is a bug-level surprise, not a transfer error.
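The difference between the two failure channels is easy to see with plain CompletableFuture. In this sketch OutcomeReport is a hypothetical stand-in for TransferReport, and the two factory methods simply simulate the outcomes described above; none of it is the library's code.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stand-in; NOT the library's TransferReport.
final class OutcomeReport {
    final List<String> errors;

    OutcomeReport(List<String> errors) {
        this.errors = errors;
    }

    boolean success() {
        return errors.isEmpty();
    }
}

final class ReportContractSketch {
    // Expected failure: the future completes normally and the problem travels in the report.
    static CompletableFuture<OutcomeReport> expectedFailure() {
        return CompletableFuture.completedFuture(
                new OutcomeReport(Collections.singletonList("target collection not empty")));
    }

    // Unexpected failure: the future itself completes exceptionally.
    static CompletableFuture<OutcomeReport> unexpectedFailure() {
        CompletableFuture<OutcomeReport> f = new CompletableFuture<>();
        f.completeExceptionally(new IllegalStateException("boom"));
        return f;
    }

    // Inspect the report first; the catch block is only for genuine surprises.
    static String classify(CompletableFuture<OutcomeReport> future) {
        try {
            OutcomeReport report = future.join();
            return report.success() ? "success" : "failed-report";
        } catch (CompletionException e) {
            return "unexpected: " + e.getCause().getMessage();
        }
    }
}
```

A caller that only wraps join() in a try/catch would treat the first case as a success, which is exactly the mistake the gotcha warns about.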
The two-argument descriptor(sourceDescriptor, targetDescriptor) decouples how the entity is read from
how it's written. Both descriptors must share the same <K, V> types so a decoded source entity can be
written to the target without conversion — but their collection name and codec may differ.
Two real uses:
// 1. Rename the collection during the move ("legacy_players" on disk -> "players" in SQL).
EntityDescriptor<UUID, PlayerData> SRC = EntityDescriptor.builder(UUID.class, PlayerData.class)
.collection("legacy_players")
.keyExtractor(PlayerData::getUuid)
.codec(JacksonJsonCodec.pretty(PlayerData.class)) // human-readable (indented) JSON on the old file store
.build();
EntityDescriptor<UUID, PlayerData> DST = EntityDescriptor.builder(UUID.class, PlayerData.class)
.collection("players")
.keyExtractor(PlayerData::getUuid)
.codec(new JacksonJsonCodec<>(PlayerData.class)) // compact JSON for the SQL column
.build();
StorageTransfer.builder()
.from(oldLocalFileStorage)
.to(newSqlStorage)
.descriptor(SRC, DST) // read SRC, write DST
.build()
.execute()
.join();
// 2. Change codec only (YAML files -> JSON in SQL), same collection name on both sides.
// The source descriptor uses JacksonYamlCodec (LocalFile-only); the target uses JSON.
.descriptor(yamlPlayersDesc, jsonPlayersDesc)
📌 Note — a codec change like this is exactly why the per-side descriptor exists: SQL/Mongo/InMemory require a JSON codec, while only LocalFile accepts YAML. Read with the source-appropriate codec, write with the target-appropriate one. See Codecs and Choosing a Backend.
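Conceptually, a two-descriptor transfer decodes each entity with the source codec and re-encodes it with the target codec, with the in-memory type unchanged in between. The toy below shows that shape with hand-rolled, hypothetical YAML-like and JSON-like functions; it does not use JacksonYamlCodec or JacksonJsonCodec, and real codecs handle far more than two integer fields.

```java
import java.util.function.Function;

// Toy entity used on both sides; the in-memory type never changes.
final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class TranscodeSketch {
    // Decode with the source side's codec, then encode with the target side's codec.
    static <V> String transcode(String raw, Function<String, V> sourceDecode,
                                Function<V, String> targetEncode) {
        V entity = sourceDecode.apply(raw);
        return targetEncode.apply(entity);
    }

    // Hypothetical source format: one "name: value" pair per line.
    static GridPoint decodeYamlLike(String text) {
        int x = 0;
        int y = 0;
        for (String line : text.split("\n")) {
            String[] kv = line.split(":");
            int value = Integer.parseInt(kv[1].trim());
            if (kv[0].trim().equals("x")) x = value;
            if (kv[0].trim().equals("y")) y = value;
        }
        return new GridPoint(x, y);
    }

    // Hypothetical target format: compact JSON-like text.
    static String encodeJsonLike(GridPoint p) {
        return "{\"x\":" + p.x + ",\"y\":" + p.y + "}";
    }
}
```

Because both functions agree on GridPoint, no conversion step is needed between them, which mirrors the requirement that both descriptors share the same K and V types.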
StorageTransfer is built for a maintenance-mode migration — the kind where you freeze writes, copy the
data, then point the application at the new backend on restart. A typical run:
- Freeze writes. Put the application in maintenance mode (whitelist / read-only) and flush any in-memory caches so the source on disk is the source of truth. (If you use the manager module, this is where you stop write-through traffic — see Caching & References.)
- Open both storages and init() them. Use the same EntityDescriptors you use in production.
- Run the transfer with the safety rails on (the defaults):
    TransferReport report = StorageTransfer.builder()
        .from(currentStorage)
        .to(newStorage)
        .descriptor(PLAYERS)
        .descriptor(ACCOUNTS)
        .applyTargetMigrations(true) // bring the target schema up first
        .failIfTargetCollectionNotEmpty(true) // a fresh target must be empty
        .verifyCounts(true) // counts must match exactly
        .errorPolicy(ErrorPolicy.FAIL_FAST) // stop on the first problem
        .progressListener(p -> log(p.collection(), p.done(), p.total()))
        .build()
        .execute()
        .join();
- Gate the cutover on report.success(). If false, inspect report.errors(), fix the cause, and re-run. FAIL_FAST leaves unstarted collections untouched, so they are still empty; a collection that was mid-copy may hold partial data, which you can wipe before retrying. Do not flip the application over on a failed report.
- Cut over. On a successful report, change the application's storage config to the new backend and restart. The source remains intact as a rollback safety net until you're confident.
💡 Tip — to resume into a partially-populated target instead of starting fresh (e.g. a previous run died midway), flip to failIfTargetCollectionNotEmpty(false) + ErrorPolicy.SKIP_EXISTING: it only adds the keys that aren't there yet, leaving already-copied entities untouched.
📌 Note — transfer activity is observable. The TRANSFER log topic emits begin / per-collection progress / completion events, and the transfer mirrors those onto the target storage's log config. Turn them up with StorageLogConfig.defaults().level(StorageLogTopic.TRANSFER, StorageLogLevel.INFO) — see Logging & Diagnostics.
- The Async API — execute() returns a CompletableFuture; composition and .join() semantics.
- Choosing a Backend — pick the source and target; data-at-rest formats and capability matrix.
- Codecs — JSON vs YAML, why the two-arg descriptor lets you change codec mid-transfer.
- Schema Migrations — what applyTargetMigrations(true) runs before the copy.
- Logging & Diagnostics — the TRANSFER topic and how to watch a transfer's progress.
- CRUD Operations — the saveAll / exists semantics the transfer is built on.
- Gotchas & Pitfalls — the "future never fails" contract and the default-on safety rails.
EveryDatabase · Home · made by Petrus Pradella