Skip to content

Inconsistency in backwards compatibility with id property #1263

Description

@chris9182

Hi, I hope you don't mind me posting these. This too is an actual bug we encounter when migrating from 4.3 to 4.4. Again summarized with Opus 4.8.

in("_id") regresses on databases upgraded to 4.4 — the legacy-_id compatibility handling covers eq/getById but not in

Version

  • nitrite: 4.4.0 (regression vs 4.2.x/4.3.x on existing databases)
  • store: nitrite-mvstore-adapter 4.4.0 (in-memory MVStore in the repro)
  • JDK: tested on 17 and 25

Summary

4.4 moved NitriteId to a long-based representation. Databases written by
earlier versions store the document _id field as a String, so 4.4 carries
explicit backward-compatibility handling to keep reading them — e.g.
NitriteDocument.getId() catches the (long) _id ClassCastException and falls
back to createId((String) _id). Because of this, eq("_id") and getById
keep working after the upgrade
(they resolve the id via NitriteId).

in("_id") did not get the same treatment. It compares the raw stored field
value, so after upgrading, in("_id", <id>) returns nothing for rows whose
_id is still a legacy String — even though eq("_id", <id>) on the same rows
returns them. A query that worked before the upgrade (and still works via eq)
silently returns empty via in. This is a partial/incomplete migration of the
_id compatibility handling: mitigated for eq/byId, missed for in.

Root cause

  1. The compatibility fallback lives in the id resolution path only —
    NitriteDocument.getId():

    try {
        id = (long) get(DOC_ID);
        return createId(id);
    } catch (ClassCastException cce) {
        if (retrievedId instanceof String) {
            return createId((String) retrievedId);   // legacy String _id -> tolerated
        }
        throw new InvalidIdException(...);
    }

    eq("_id") is routed to a byId lookup by FindOptimizer.planForIdFilter
    (nitriteMap.get(createId(...))), so it matches by NitriteId and benefits
    from this.

  2. in("_id") is not routed to byId (the planner special-cases only
    EqualsFilter), and _id has no field index, so it falls to a collection
    scan whose InFilter.apply compares the raw field value:

    Object fieldValue = document.get(getField());          // legacy _id -> String "3"
    if (fieldValue instanceof Comparable)
        return comparableSet.contains((Comparable<?>) fieldValue);   // Long 3 set has no "3"

    No NitriteId/legacy-String handling → miss.

Reproduction (pure 4.4; a legacy String _id is what an upgraded DB contains)

Nitrite db = Nitrite.builder()
    .loadModule(MVStoreModule.withConfig().build())
    .openOrCreate();
NitriteCollection c = db.getCollection("c");

// legacy-style _id: a numeric String (validId accepts it; this is what pre-4.4 wrote)
c.insert(Document.createDocument("_id", "3").put("name", "m3"));

c.find(where("_id").eq(3L)).size();               // 1  — eq is mitigated, matches
c.getById(NitriteId.createId(3L)) != null;        // true — getById is mitigated, matches
c.find(where("_id").in(new Long[]{3L})).size();   // 0  — in is NOT mitigated: regression

Verified against the 4.4.0 release jars: eq/getById match, in matches 0.

Expected

in("_id") honors the same legacy-_id compatibility as eq/getById — i.e.
it is id-aware (NitriteId-based), so upgrading a database to 4.4 does not
silently break in("_id") queries that eq still answers.

Suggested fix

Give in("_id") the same NitriteId-based matching eq/getById already have.
Since in("_id") is always a collection scan (_id has no field index),
InFilter.apply is the only path:

import static org.dizitart.no2.common.Constants.DOC_ID;
...
private final Set<NitriteId> idSet;   // non-null only for the _id field

InFilter(String field, Comparable<?>... values) {
    super(field, values);
    this.comparableSet = new HashSet<>();
    Collections.addAll(this.comparableSet, values);
    this.idSet = DOC_ID.equals(field) ? new HashSet<>() : null;
    if (idSet != null)
        for (Comparable<?> v : values)
            if (v != null) idSet.add(NitriteId.createId(v.toString()));
}

@Override
public boolean apply(Pair<NitriteId, Document> element) {
    if (idSet != null)
        return idSet.contains(element.getFirst());   // match by NitriteId, like eq/byId
    Object fieldValue = element.getSecond().get(getField());
    return fieldValue instanceof Comparable && comparableSet.contains((Comparable<?>) fieldValue);
}

This adds only a one-time set build and a per-row null-check to non-_id in
queries; the indexed applyOnIndex fast path is untouched. It keeps in("_id")
O(N) (as it is today, since _id isn't a field index).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions