Allow disambiguating duplicated note model UUIDs in collection #155

aplaice · 2021-12-05T22:16:35Z

The issue of note model UUIDs being cloned along with the note models themselves has been fixed (#136). However, many users' collections will already contain such duplicated note model UUIDs. Also, if a user clones a note type on another platform, then the UUID will still be duplicated.

The disambiguation is run before export and snapshot, since that's when it's most needed to avoid broken deck.jsons.

In the long run, we could perhaps switch to running the code only after syncing (since our "attack surface" would be cloning of note models on other platforms).

Running disambiguate_note_model_uuids takes 10 ms on a collection with 20 note types (without duplicates), which is IMO an acceptable overhead.

The file disambiguate_uuids.py could also contain the (far lower priority) disambiguation of deck config UUIDs (see #135).

I've chosen to set a new UUID immediately, rather than just removing it and letting it be regenerated when it's next needed (possibly during the export/snapshot immediately after), which would work just as well. IMO setting it immediately is more comprehensible to the end user: the new and old UUIDs are both listed together, and users can inspect their deck.jsons.

The key question is how to determine which note model is "the original". I think that using the note type id (notetypes.id) is a sufficiently good proxy.
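A minimal sketch of the idea, not the code in this PR: note models are represented here as plain dicts with an `id` and a `crowdanki_uuid` key, and the function name, the dict layout and the use of `uuid4` are all assumptions made for illustration. Within each group of note models sharing a UUID, the one with the smallest id is treated as the original and every other one gets a fresh UUID immediately.

```python
import uuid
from collections import defaultdict


def reassign_duplicate_uuids(note_models):
    """Give clones a fresh UUID; return (model id, old UUID, new UUID) tuples.

    `note_models` is a hypothetical list of dicts, each with an integer
    "id" and a string "crowdanki_uuid". The dicts are modified in place.
    """
    # Group the note models by their current UUID.
    by_uuid = defaultdict(list)
    for model in note_models:
        by_uuid[model["crowdanki_uuid"]].append(model)

    changes = []
    for old_uuid, group in by_uuid.items():
        if len(group) < 2:
            continue  # this UUID is unique, nothing to do
        # Smallest id first: that model is assumed to be the original.
        ordered = sorted(group, key=lambda m: m["id"])
        for clone in ordered[1:]:
            new_uuid = str(uuid.uuid4())  # assumption: any fresh UUID will do
            clone["crowdanki_uuid"] = new_uuid
            changes.append((clone["id"], old_uuid, new_uuid))
    return changes
```

Returning the old and new UUIDs together is what would let them be listed side by side for the user.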

Details

When creating a new note model in Anki (e.g. via cloning), the id of the new note type is the time in milliseconds since the epoch, unless that id is already taken, in which case it's that time + 1. In either case, the id of any newer note model will be greater than the id of any older note model. AFAICT AnkiDroid uses the same Rust backend, so its behaviour should be the same. AnkiMobile generally reuses Anki's code, so it should also behave the same way.
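The id assignment described above can be sketched roughly as follows. This is an illustration of the described behaviour, not Anki's actual code: the function name and the `existing_ids` parameter are hypothetical, and repeating the increment until a free id is found is an assumption (the description above only mentions time + 1).

```python
import time


def next_note_type_id(existing_ids, now_ms=None):
    """Return a note type id based on the current time in milliseconds.

    `existing_ids` is a hypothetical set of ids already in the collection.
    `now_ms` can be passed explicitly to make the result reproducible.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    candidate = now_ms
    # Assumption: keep bumping by 1 ms until the id is free.
    while candidate in existing_ids:
        candidate += 1
    return candidate
```

Either way, a note type created later gets an id that is greater than the ids of note types created earlier, which is the property the disambiguation relies on.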

(I haven't dug into the old Anki Python version of the code, but from what I've read online, it's "always" been the case that the note type id should be the Unix time (in ms). Looking at some of my own old (built-in) note types, their ids correspond to the time when I started using Anki, further supporting the hypothesis.)

AFAICT based on skimming the code and testing, updating an existing note type (even changing the number of fields and the number of cards) does not change its id (which feels logical).

People might also generate model ids outside Anki, for instance using genanki. Genanki's recommendation is to generate a random integer between 2^31 and 2^32. Since all Unix times (in ms) from this millennium are greater than 2^32, assuming that people followed genanki's recommendation, the id of any genanki note model will be smaller than the id of any note model created from it (by cloning) in Anki, so here again the ordering of ids maps well onto newness.
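The arithmetic behind this comparison can be checked quickly. The snippet below only uses the 2^32 upper bound quoted above; the constant names are made up for illustration.

```python
from datetime import datetime, timezone

# Upper bound of the genanki range as quoted above.
GENANKI_UPPER_BOUND = 1 << 32  # 4294967296

# Unix time in milliseconds at the start of 2000-01-01 (UTC).
MILLENNIUM_START_MS = int(
    datetime(2000, 1, 1, tzinfo=timezone.utc).timestamp() * 1000
)

# The moment at which a millisecond timestamp first reached 2^32.
BOUND_AS_DATETIME = datetime.fromtimestamp(
    GENANKI_UPPER_BOUND / 1000, tz=timezone.utc
)

print(MILLENNIUM_START_MS > GENANKI_UPPER_BOUND)
print(BOUND_AS_DATETIME.date())
```

Millisecond timestamps passed 2^32 less than two months after the epoch, in February 1970, so any time-based id assigned by Anki is far above the quoted genanki range.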

There are some special cases, for instance when upgrading a collection which had a note model id of 0, but here again the generated id is less than 2^32, so it will also be smaller than any time-based id.

Summary

Overall, while it's in principle possible that the cloned copy of a note model will have a smaller id than the original note model, it seems highly unlikely.


aplaice added 2 commits December 5, 2021 23:14

Add test for note model UUID de-duplication

1bd653e

aplaice merged commit 5f79ca9 into Stvad:master Dec 11, 2021

aplaice deleted the disambiguate_note_model_fix branch December 11, 2021 13:07

aplaice mentioned this pull request Dec 11, 2021

Avoid duplicating crowdanki_uuid when cloning deck config #135

Merged
