Hide data representation inside RDKit::Dict#9113
Conversation
Replace direct access to Dict's internal std::vector<Pair> with encapsulated methods: size(), empty(), const iteration via begin()/end(), appendPair(), markNonPOD(), and getRawVal(). This enables future changes to Dict's internal representation without breaking callers. Ref: rdkit#9112
appendPair(Pair&&) now auto-detects non-POD status via RDValue::needsCleanup(), eliminating markNonPOD() and the risk of dangling references or uninitialized entries. needsCleanup() is placed next to destroy() on RDValue to keep the POD/non-POD distinction in one place.
Both callers ignored the output. Non-POD detection is now handled by Dict::appendPair via RDValue::needsCleanup().
| for (unsigned index = 0; index < count; ++index) { | ||
| CHECK_INVARIANT(streamReadProp(ss, dict.getData()[startSz + index], | ||
| dict.getNonPODStatus(), handlers), | ||
| Dict::Pair pair; |
There was a problem hiding this comment.
This is likely to be much less performant since we don't have the call to "resize".
There was a problem hiding this comment.
I've added a new bulk-append method.
void extend(std::vector<Pair> &&pairs) {
for (auto &p : pairs) {
_hasNonPodData |= p.val.needsCleanup();
}
_data.insert(_data.end(), std::make_move_iterator(pairs.begin()),
std::make_move_iterator(pairs.end()));
}
It's still one more allocation than master. Since we are moving elements to a new vec.
| if (propType == handler->getPropName()) { | ||
| handler->read(ss, pair.val); | ||
| dictHasNonPOD = true; | ||
| return true; |
There was a problem hiding this comment.
You don't have CustomTag in the needsCleanup above.
There was a problem hiding this comment.
After review, this might be converted to an AnyTag, we should add a test for this now that we don't have the dictHasNonPod anymore.
There was a problem hiding this comment.
See FooHandler in tesDict.cpp
|
Nice work, I think it's a good idea to hide the implementation. I have a few suggestions and the Custom handlers need to be explicitly tested so that we are sure they are destroyed. |
- Add Dict::append(vector<Pair>&&) for bulk insertion with reserve - Use bulk append in streamReadProps to restore pre-allocation - Rename getRawVal -> getRDValue per reviewer preference - Add test verifying custom AnyTag data is destroyed through Dict lifecycle
|
@greglandrum I think this is worth looking at now. |
Exercises the full streamWriteProps/streamReadProps path with an ExplicitBitVect in an RDProps Dict, confirming the custom handler is invoked and no memory is leaked (verified under valgrind).
5116890 adds a test to exercise the |
greglandrum
left a comment
There was a problem hiding this comment.
One small change requested to the tests.
Otherewise this looks good to me and I agree with Brian that it makes a lot of sense.
| } | ||
| }; | ||
|
|
||
| TEST_CASE("custom AnyTag data is destroyed through Dict lifecycle") { |
There was a problem hiding this comment.
The count tests here pass even if you never call setVal.
I think this would be more precise if you checked the expected value of count instead of simply testing that it's > a target value.
|
@bddap any chance that you can make this change relatively quickly? This one would need to be in a major release and we're doing one of those next week. Otherwise we'll have it on master, but it won't make it out in a release until September. |
|
Thanks for the contribution @bddap! |
implements #9112