Ill-typed upgrades cause data-loss #2692

crusso · 2021-07-29T16:46:24Z

Doing an ill-typed upgrade can cause stable data loss.

The problem is the recent change to Candid to support defaulting of optional fields.

Motoko encodes stable variables in a record of options.

Doing an upgrade to an incompatible type will cause those optional fields to default to null, discarding any data those fields might have held (and just running the initializer instead).
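To make the failure mode concrete, here is a minimal Python sketch of it. It is a toy model only, not the actual Candid or Motoko implementation: the helper names (`conforms`, `decode_opt_defaulting`, `upgrade`) and the representation of types as Python values are assumptions made for illustration.

```python
# Toy model (NOT the real Candid/Motoko code) of how option defaulting
# turns a type mismatch into silent data loss during an upgrade.
# Types are modelled as: a Python type (primitive), a dict of
# field name -> type (record), or a one-element list [T] (array of T).

def conforms(value, ty):
    """Return True if `value` structurally matches the toy type `ty`."""
    if isinstance(ty, dict):   # record: every expected field must be present
        return isinstance(value, dict) and all(
            name in value and conforms(value[name], field_ty)
            for name, field_ty in ty.items())
    if isinstance(ty, list):   # array: every element must match
        return isinstance(value, list) and all(conforms(v, ty[0]) for v in value)
    return isinstance(value, ty)  # primitive

def decode_opt_defaulting(stored, new_types):
    """Each stable variable is an optional field; a mismatch decodes to None."""
    return {name: stored[name]
            if name in stored and conforms(stored[name], ty) else None
            for name, ty in new_types.items()}

def upgrade(stored, new_types, initializers):
    """A None field is treated as 'uninitialized' and the initializer runs."""
    decoded = decode_opt_defaulting(stored, new_types)
    return {name: decoded[name] if decoded[name] is not None
            else initializers[name]()
            for name in new_types}

old_state = {"users": [{"name": "alice"}, {"name": "bob"}]}
# The "natural mistake": add a field to the record type of a stable variable.
new_types = {"users": [{"name": str, "age": int}]}
new_state = upgrade(old_state, new_types, {"users": list})
print(new_state)  # {'users': []}
```

In this model the upgrade does not fail: the old value simply does not match the new type, the field decodes to null, and the initializer replaces the stored data with an empty array.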

A conformance check prior to upgrade could warn about this potential data loss, but absent tooling for that, we are in a situation where users can easily lose their data on upgrade by making a simple and natural mistake, like adding a field to a stable variable of record type (or, more commonly, a map thereof).

#2691 contains a repro of a sequence of upgrades that should fail but don't.

https://forum.dfinity.org/t/questions-about-data-structures-and-migrations/822/15 is a real-world example of a user being tripped up by this, trying to "upgrade" a stable array of tuples containing records by adding a field.
Instead of failing the upgrade, the stable variable is regarded as uninitialized and initialized from its initializer expression (losing the data).

Solution: I briefly considered just changing the record of options encoding to a record of variants encoding for stable variables, which would be fine, but isn't backwards compatible with previously compiled code, so a non-starter.

I'll investigate whether we can just change the deserialization behaviour for "T.Memory" records to avoid the option defaulting instead.

@rossberg, @nomeata whaddya think?

Of course, we could say we really need the static check, but I think a defensive implementation is called for anyway.

crusso · 2021-07-30T07:24:42Z

Hmm, for consistency, I wonder if we should just live with this but make the future static checker warn on lossy re-typing of stable vars.

rossberg · 2021-07-30T09:37:17Z

Ouch, yeah. It's really overdue that we implement this upgrade check!

crusso · 2021-07-30T10:13:49Z

Yeah, but it's not entirely clear to me what to check.

1) Do we output the candidish type and check an extended candidish subtyping relation? Then we need to implement an extended candid just for that.
2) Or do we use a slightly extended relation on the Motoko (stable) types? If so, we need to serialize the Motoko types (that might involve mu types) to a format we can check. Probably useful for separate compilation later, but definitely not supported right now.
3) Another alternative, avoiding serialization of types, is just to type check both programs (before and after) in memory and check the relationship between the in-memory types (but that seems gross and won't really accommodate language changes).

I think 1) is probably the way to go, since we have most of the machinery already and just need to extend candid a bit.

crusso · 2021-07-30T10:18:52Z

Another question is where to store this info: in a separate file, in a custom section, or in the wasm binary as an additional query method, as Joachim hacked up for the (external) Candid interface. The latter actually has some appeal.

FloorLamp · 2021-09-02T19:44:35Z

Hi Claudio, has there been any progress on this issue? I recently lost some state due to this, so it would be good to prevent this from happening to others!

anthonymq · 2021-11-12T16:07:41Z

Still no progress? What do you suggest to prevent data loss while we wait for a clean solution?
I'm coding a bash script to dump the data (only for the canister owner) and restore the data. The thing is that I'm getting Candid data, and it's a bit hard to parse it and modify it if there is a new field in a type.

crusso · 2021-11-14T23:22:32Z

PR #2887 (not yet merged) adds some support for manually checking compatibility of stable signatures using moc.

Roughly,

moc --stable-types foo.mo will write the stable signature of an actor (class) to the file foo.most.
moc --stable-compatible old.most new.most will check that the stable interface can evolve from old.most to new.most in a type-safe way without unintentional data loss.

The separate didc tool can already be used to manually check that the Candid interfaces (dids) are compatible:

didc --check new.did old.did

What remains is integrating this with the replica and dfx to prevent an unsafe upgrade, but that's still underway and may take a while.

I'm also working on some informal examples to explain what is allowed and what isn't in #2897.

The basic rules for stable-compatible are that you can:

- evolve a stable variable to a supertype with the same mutability (so that the upgrade can consume the old value);
- add a new stable variable with a different name and any type (it takes its value from the initializer when first introduced).

The Candid interfaces can evolve to a Candid subtype (note the inversion of the relation), so that old clients can use the new interface without breaking.

In practice, with the current implementation, one can safely change the mutability of a stable variable in an upgrade and drop an existing stable variable, but we have decided to error on those cases for now (but might only warn in future).
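The stable-variable rules above can be sketched as a small checker. This is a simplified Python illustration, not the algorithm implemented by moc: the signature representation (a dict from variable name to a `(mutable, type)` pair), the function names, and the tiny type language (primitive names plus records, with `Nat <: Int` and record width/depth subtyping) are all assumptions made for the example.

```python
# Toy stable-compatibility check (an illustration, NOT moc's implementation).
# A signature maps each stable variable name to (is_mutable, type).
# Types are primitive names ("Nat", "Int", "Text") or dicts for records.

def is_subtype(t1, t2):
    """Toy subtyping t1 <: t2."""
    if isinstance(t1, dict) and isinstance(t2, dict):
        # Record subtyping: t1 may have extra fields, never fewer.
        return all(f in t1 and is_subtype(t1[f], t2[f]) for f in t2)
    return t1 == t2 or (t1, t2) == ("Nat", "Int")

def stable_compatible(old_sig, new_sig):
    """Return a list of error strings; an empty list means compatible."""
    errors = []
    for name, (old_mut, old_ty) in old_sig.items():
        if name not in new_sig:
            errors.append(f"{name}: dropped")  # reported as an error for now
            continue
        new_mut, new_ty = new_sig[name]
        if old_mut != new_mut:
            errors.append(f"{name}: mutability changed")
        if not is_subtype(old_ty, new_ty):
            errors.append(f"{name}: new type is not a supertype of the old type")
    # Variables that appear only in new_sig are fine: they take their
    # value from the initializer when first introduced.
    return errors

old = {"user": (True, {"name": "Text"}), "hits": (True, "Nat")}

# Allowed: widen Nat to Int and add a brand-new stable variable.
ok = {"user": (True, {"name": "Text"}), "hits": (True, "Int"),
      "log": (True, "Text")}
print(stable_compatible(old, ok))   # []

# Rejected: adding a field yields a subtype, not a supertype, of the old record.
bad = {"user": (True, {"name": "Text", "age": "Nat"}), "hits": (True, "Nat")}
print(stable_compatible(old, bad))
# ['user: new type is not a supertype of the old type']
```

The sketch covers only the stable signature. The Candid interface check goes in the opposite direction (the new interface must be a subtype of the old one) and is not modelled here.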

nomeata · 2022-01-10T17:21:11Z

#2887 is merged. Shall we close this?

chenyan-dfinity · 2022-01-10T18:06:36Z

The replica change is still not in production, but we can close this, as it's a Motoko issue.

crusso linked a pull request Jul 29, 2021 that will close this issue

Ill-typed upgrades causing data-loss (repro) #2691

Draft

chenyan-dfinity closed this as completed Jan 10, 2022
