JavaScript: account for the fact that our JavaScript bindings use a 32-bit address space#1367
JavaScript: account for the fact that our JavaScript bindings use a 32-bit address space#1367agarny wants to merge 8 commits intocellml:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates libCellML’s cached variable-equivalence logic to avoid cache-key collisions in 32-bit address spaces (notably the JavaScript/WASM bindings), addressing incorrect cache hits on very large models (Fixes #1366).
Changes:
- Replaces the Cantor-pairing-based cache key with a
(uintptr_t, uintptr_t)key in anunordered_mapforAnalyserModel::areEquivalentVariables(). - Switches generator initialisation logic to use the
AnalyserModelcachedareEquivalentVariables()API. - Adds (currently disabled/commented-out) large-model regression tests across C++, Python, and JavaScript bindings.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/analysermodel.cpp |
Uses a pointer-pair key for cached equivalence results to avoid 32-bit overflow collisions. |
src/analysermodel_p.h |
Introduces VariableKeyPair + custom hash and moves cache storage to unordered_map. |
src/generator.cpp |
Routes equivalence checks through mAnalyserModel->areEquivalentVariables() (cached). |
tests/generator/generator.cpp |
Adds a large-model regression test, but it is commented out. |
tests/bindings/python/test_generator.py |
Adds a large-model regression test, but it is disabled via a triple-quoted block. |
tests/bindings/javascript/generator.test.js |
Adds a large-model regression test, but it is commented out; also introduces unused imports. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Note that the test is disabled since it performs repeated parse/analyse/generate loops, which take several minutes to complete. So, it should be enabled only when needed, and not as part of our regular test suite.
... rather than the generic areEquivalentVariables() method which doesn't include caching.
Indeed, in our WebAssembly module, the key was 32-bit (while 64-bit in C++/Python) which caused collisions and incorrect cache hits. So, we replaced the Cantor-pairing key and std::map-based cache with a typed VariableKeyPair and std::unordered_map. This makes the cached-equivalence lookup more explicit, portable, and efficient.
|
|
||
| bool operator==(const VariableKeyPair &other) const | ||
| { | ||
| #ifdef CODE_COVERAGE_ENABLED |
There was a problem hiding this comment.
is this a new thing that we haven't done before? or doing something we've done before in a different way?
There was a problem hiding this comment.
CODE_COVERAGE_ENABLED is indeed a new thing. It is needed for our code coverage test (in case the name didn't give it away! :)). We have a line that normally reads:
return first == other.first && second == other.second;second == other.second is some kind of guard, but if I recall correctly code coverage tells us that it is is never false and that's because if first == other.first is false then second == other.second doesn't get evaluated. So, I use CODE_COVERAGE_ENABLED for the case the code is use during code coverage and, in this case, rather than executing:
return first == other.first && second == other.second;we execute:
const auto firstEqual = static_cast<uintptr_t>(first == other.first);
const auto secondEqual = static_cast<uintptr_t>(second == other.second);
return firstEqual * secondEqual != 0;which is the same (albeit less efficient) and has 100% coverage.
There was a problem hiding this comment.
I am super unkeen to see this type of switching happening. We have avoided this as it simply makes a mockery of claiming that we have 100% line coverage. Have you had a look at the generated machine code to actually see if the second is less efficient (has more machine code), quite often the compiler generates the same code when optimisation is turned on.
There was a problem hiding this comment.
As far as my analysis goes we can replace && with &:
// Bitwise test on booleans.
(first == other.first) & (second == other.second)
This will not create a branch in the debug build.
The optimised builds of either way will produce very similar machine code.
There was a problem hiding this comment.
I have just checked with https://godbolt.org/ and the generated machine code is different for & and && in debug mode BUT the same in release mode. So, thanks for that, will implement your suggestion.
There was a problem hiding this comment.
It seems like a significant change that we shouldn’t bury under other changes.
There was a problem hiding this comment.
for sure! We should have a single clear PR that makes a change like that isolated from any other code changes.
There was a problem hiding this comment.
And do we want that done before this goes through? So that this PR can take advantage of C++20?
There was a problem hiding this comment.
I wouldn't think so. You'd want to update all similar comparisons, right? This particular issue has already been resolved, right? so its not blocking this PR being merged.
There was a problem hiding this comment.
Happy to leave this PR as-is and have C++20 added in another PR that will also remove those operator==() methods.
nickerso
left a comment
There was a problem hiding this comment.
I think this looks ok, just trying to understand the introduction and use of the new code coverage define.
In debug mode (what we use for our coverage tests), `A & B` will get both `A` and `B` to be evaluated even if `A` is `false`. This means that we get 100% branch coverage. In release mode, `A & B` will only evaluate `B` if and only if `A` is `true`, which is the kind of optimisation we are after.
|
For me this looks like a good solution but adding a big file that is not used doesn’t seem to me to be the way to go. So for me this PR needs to get rid of the big file and the unused/commented out tests (from history as well) and just keep the cache changes. I will add some other issues to capture these test change idead, where people can add their views in a more appropriate place than this PR comment chain. |
|
Real world example issue is available under here: #1380. Comments for that idea should be sent there. |
100% agreed.
Agreed, am going to remove them and force push things. |
|
Closing this PR in favour of PR #1384. I couldn't "easily" remove the very big CellML file from the Git history. So, faster and easier to create a new PR. |
Fixes #1366.