Hi, I hope you don't mind me posting these. This too is an actual bug we encounter when migrating from 4.3 to 4.4. Again summarized with Opus 4.8.
in("_id") regresses on databases upgraded to 4.4 — the legacy-_id compatibility handling covers eq/getById but not in
Version
- nitrite: 4.4.0 (regression vs 4.2.x/4.3.x on existing databases)
- store: nitrite-mvstore-adapter 4.4.0 (in-memory MVStore in the repro)
- JDK: tested on 17 and 25
Summary
4.4 moved NitriteId to a long-based representation. Databases written by
earlier versions store the document _id field as a String, so 4.4 carries
explicit backward-compatibility handling to keep reading them — e.g.
NitriteDocument.getId() catches the (long) _id ClassCastException and falls
back to createId((String) _id). Because of this, eq("_id") and getById
keep working after the upgrade (they resolve the id via NitriteId).
in("_id") did not get the same treatment. It compares the raw stored field
value, so after upgrading, in("_id", <id>) returns nothing for rows whose
_id is still a legacy String — even though eq("_id", <id>) on the same rows
returns them. A query that worked before the upgrade (and still works via eq)
silently returns empty via in. This is a partial/incomplete migration of the
_id compatibility handling: mitigated for eq/byId, missed for in.
Root cause
-
The compatibility fallback lives in the id resolution path only —
NitriteDocument.getId():
try {
id = (long) get(DOC_ID);
return createId(id);
} catch (ClassCastException cce) {
if (retrievedId instanceof String) {
return createId((String) retrievedId); // legacy String _id -> tolerated
}
throw new InvalidIdException(...);
}
eq("_id") is routed to a byId lookup by FindOptimizer.planForIdFilter
(nitriteMap.get(createId(...))), so it matches by NitriteId and benefits
from this.
-
in("_id") is not routed to byId (the planner special-cases only
EqualsFilter), and _id has no field index, so it falls to a collection
scan whose InFilter.apply compares the raw field value:
Object fieldValue = document.get(getField()); // legacy _id -> String "3"
if (fieldValue instanceof Comparable)
return comparableSet.contains((Comparable<?>) fieldValue); // Long 3 set has no "3"
No NitriteId/legacy-String handling → miss.
Reproduction (pure 4.4; a legacy String _id is what an upgraded DB contains)
Nitrite db = Nitrite.builder()
.loadModule(MVStoreModule.withConfig().build())
.openOrCreate();
NitriteCollection c = db.getCollection("c");
// legacy-style _id: a numeric String (validId accepts it; this is what pre-4.4 wrote)
c.insert(Document.createDocument("_id", "3").put("name", "m3"));
c.find(where("_id").eq(3L)).size(); // 1 — eq is mitigated, matches
c.getById(NitriteId.createId(3L)) != null; // true — getById is mitigated, matches
c.find(where("_id").in(new Long[]{3L})).size(); // 0 — in is NOT mitigated: regression
Verified against the 4.4.0 release jars: eq/getById match, in matches 0.
Expected
in("_id") honors the same legacy-_id compatibility as eq/getById — i.e.
it is id-aware (NitriteId-based), so upgrading a database to 4.4 does not
silently break in("_id") queries that eq still answers.
Suggested fix
Give in("_id") the same NitriteId-based matching eq/getById already have.
Since in("_id") is always a collection scan (_id has no field index),
InFilter.apply is the only path:
import static org.dizitart.no2.common.Constants.DOC_ID;
...
private final Set<NitriteId> idSet; // non-null only for the _id field
InFilter(String field, Comparable<?>... values) {
super(field, values);
this.comparableSet = new HashSet<>();
Collections.addAll(this.comparableSet, values);
this.idSet = DOC_ID.equals(field) ? new HashSet<>() : null;
if (idSet != null)
for (Comparable<?> v : values)
if (v != null) idSet.add(NitriteId.createId(v.toString()));
}
@Override
public boolean apply(Pair<NitriteId, Document> element) {
if (idSet != null)
return idSet.contains(element.getFirst()); // match by NitriteId, like eq/byId
Object fieldValue = element.getSecond().get(getField());
return fieldValue instanceof Comparable && comparableSet.contains((Comparable<?>) fieldValue);
}
This adds only a one-time set build and a per-row null-check to non-_id in
queries; the indexed applyOnIndex fast path is untouched. It keeps in("_id")
O(N) (as it is today, since _id isn't a field index).
Hi, I hope you don't mind me posting these. This too is an actual bug we encounter when migrating from 4.3 to 4.4. Again summarized with Opus 4.8.
in("_id")regresses on databases upgraded to 4.4 — the legacy-_idcompatibility handling coverseq/getByIdbut notinVersion
Summary
4.4 moved
NitriteIdto along-based representation. Databases written byearlier versions store the document
_idfield as a String, so 4.4 carriesexplicit backward-compatibility handling to keep reading them — e.g.
NitriteDocument.getId()catches the(long) _idClassCastExceptionand fallsback to
createId((String) _id). Because of this,eq("_id")andgetByIdkeep working after the upgrade (they resolve the id via
NitriteId).in("_id")did not get the same treatment. It compares the raw stored fieldvalue, so after upgrading,
in("_id", <id>)returns nothing for rows whose_idis still a legacy String — even thougheq("_id", <id>)on the same rowsreturns them. A query that worked before the upgrade (and still works via
eq)silently returns empty via
in. This is a partial/incomplete migration of the_idcompatibility handling: mitigated foreq/byId, missed forin.Root cause
The compatibility fallback lives in the id resolution path only —
NitriteDocument.getId():eq("_id")is routed to a byId lookup byFindOptimizer.planForIdFilter(
nitriteMap.get(createId(...))), so it matches byNitriteIdand benefitsfrom this.
in("_id")is not routed to byId (the planner special-cases onlyEqualsFilter), and_idhas no field index, so it falls to a collectionscan whose
InFilter.applycompares the raw field value:No
NitriteId/legacy-String handling → miss.Reproduction (pure 4.4; a legacy String
_idis what an upgraded DB contains)Verified against the 4.4.0 release jars:
eq/getByIdmatch,inmatches 0.Expected
in("_id")honors the same legacy-_idcompatibility aseq/getById— i.e.it is id-aware (
NitriteId-based), so upgrading a database to 4.4 does notsilently break
in("_id")queries thateqstill answers.Suggested fix
Give
in("_id")the sameNitriteId-based matchingeq/getByIdalready have.Since
in("_id")is always a collection scan (_idhas no field index),InFilter.applyis the only path:This adds only a one-time set build and a per-row null-check to non-
_idinqueries; the indexed
applyOnIndexfast path is untouched. It keepsin("_id")O(N) (as it is today, since
_idisn't a field index).