
JavaScript: account for the fact that our JavaScript bindings use a 32-bit address space #1367

Closed
agarny wants to merge 8 commits into cellml:main from agarny:fixes

Contributor

@agarny agarny commented Apr 9, 2026

Fixes #1366.

Copilot AI review requested due to automatic review settings April 9, 2026 00:54

Copilot AI left a comment


Pull request overview

Updates libCellML’s cached variable-equivalence logic to avoid cache-key collisions in 32-bit address spaces (notably the JavaScript/WASM bindings), addressing incorrect cache hits on very large models (Fixes #1366).

Changes:

  • Replaces the Cantor-pairing-based cache key with a (uintptr_t, uintptr_t) key in an unordered_map for AnalyserModel::areEquivalentVariables().
  • Switches generator initialisation logic to use the AnalyserModel cached areEquivalentVariables() API.
  • Adds (currently disabled/commented-out) large-model regression tests across C++, Python, and JavaScript bindings.
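The first change listed above can be illustrated with a minimal sketch. Only the names `VariableKeyPair`, `uintptr_t`, and `unordered_map` come from this PR; the `Variable` stand-in, `VariableKeyPairHash`, `makeKey()`, the canonical ordering of the two addresses, and the hash-mixing scheme are assumptions made for illustration and may differ from what libCellML actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for a libCellML variable.
struct Variable
{
    int id;
};

// Key made of the two variable addresses. No arithmetic is done on the
// addresses, so nothing can overflow, whatever the width of uintptr_t.
struct VariableKeyPair
{
    std::uintptr_t first;
    std::uintptr_t second;

    bool operator==(const VariableKeyPair &other) const
    {
        return (first == other.first) & (second == other.second);
    }
};

// Assumed hash: a Boost-style hash_combine of the two addresses.
struct VariableKeyPairHash
{
    std::size_t operator()(const VariableKeyPair &key) const
    {
        std::size_t seed = std::hash<std::uintptr_t> {}(key.first);

        seed ^= std::hash<std::uintptr_t> {}(key.second) + 0x9e3779b9u + (seed << 6) + (seed >> 2);

        return seed;
    }
};

// Assumption: store the smaller address first so that (v1, v2) and (v2, v1)
// share one cache entry.
VariableKeyPair makeKey(const Variable *v1, const Variable *v2)
{
    const auto a = reinterpret_cast<std::uintptr_t>(v1);
    const auto b = reinterpret_cast<std::uintptr_t>(v2);

    return (a <= b) ? VariableKeyPair {a, b} : VariableKeyPair {b, a};
}

using EquivalenceCache = std::unordered_map<VariableKeyPair, bool, VariableKeyPairHash>;
```

With such a key, a hash collision only costs an extra `operator==()` call; it cannot produce a wrong cache hit, since two entries match only when both addresses are equal.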

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/analysermodel.cpp | Uses a pointer-pair key for cached equivalence results to avoid 32-bit overflow collisions. |
| src/analysermodel_p.h | Introduces VariableKeyPair + custom hash and moves cache storage to unordered_map. |
| src/generator.cpp | Routes equivalence checks through mAnalyserModel->areEquivalentVariables() (cached). |
| tests/generator/generator.cpp | Adds a large-model regression test, but it is commented out. |
| tests/bindings/python/test_generator.py | Adds a large-model regression test, but it is disabled via a triple-quoted block. |
| tests/bindings/javascript/generator.test.js | Adds a large-model regression test, but it is commented out; also introduces unused imports. |


Comment thread tests/bindings/javascript/generator.test.js Outdated
Comment thread tests/bindings/javascript/generator.test.js
Comment thread tests/bindings/python/test_generator.py
Comment thread tests/generator/generator.cpp
agarny added 2 commits April 9, 2026 13:54
Note that the test is disabled since it performs repeated parse/analyse/generate loops, which take several minutes to complete. So, it should be enabled only when needed, and not as part of our regular test suite.
... rather than the generic areEquivalentVariables() method which doesn't include caching.
Indeed, in our WebAssembly module, the key was 32-bit (while 64-bit in C++/Python) which caused collisions and incorrect cache hits.

So, we replaced the Cantor-pairing key and std::map-based cache with a typed VariableKeyPair and std::unordered_map. This makes the cached-equivalence lookup more explicit, portable, and efficient.
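To make the collision concrete, here is a hedged reconstruction of a Cantor-pairing key evaluated in 32-bit and in 64-bit arithmetic. The `cantorPair()` helper and the input values are hypothetical (they are small numbers chosen to keep the arithmetic checkable, not real addresses), and the exact expression that libCellML used is not shown in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Cantor pairing: (a + b) * (a + b + 1) / 2 + b, computed in the unsigned
// integer type T. Unsigned arithmetic wraps modulo 2^N, so the product can
// lose its high bits when T is only 32 bits wide.
template<typename T>
T cantorPair(T a, T b)
{
    const T sum = a + b;

    return static_cast<T>(sum * (sum + 1) / 2 + b);
}
```

For the pairs (65536, 0) and (127, 128), the 32-bit version returns 32768 in both cases because 65536 * 65537 wraps around to 65536, whereas the 64-bit version returns 2147516416 and 32768 respectively. Two different pairs of variables therefore end up sharing one cache entry when the key is 32-bit.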
Comment thread src/analysermodel_p.h Outdated

bool operator==(const VariableKeyPair &other) const
{
#ifdef CODE_COVERAGE_ENABLED
Contributor

Is this a new thing that we haven't done before, or is it doing something we've done before in a different way?

Contributor Author

CODE_COVERAGE_ENABLED is indeed a new thing. It is needed for our code coverage test (in case the name didn't give it away! :)). We have a line that normally reads:

return first == other.first && second == other.second;

`second == other.second` is some kind of guard, but, if I recall correctly, code coverage tells us that it is never false, and that's because if `first == other.first` is false then `second == other.second` doesn't get evaluated. So, I use `CODE_COVERAGE_ENABLED` for the case where the code is used during code coverage and, in this case, rather than executing:

return first == other.first && second == other.second;

we execute:

const auto firstEqual = static_cast<uintptr_t>(first == other.first);
const auto secondEqual = static_cast<uintptr_t>(second == other.second);

return firstEqual * secondEqual != 0;

which is the same (albeit less efficient) and has 100% coverage.
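As a sanity check of the "which is the same" claim, the two forms can be placed side by side. This is only a sketch: the `Key` struct and the two function names are made up for the comparison and are not part of libCellML.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-field key, mirroring the first/second members above.
struct Key
{
    std::uintptr_t first;
    std::uintptr_t second;
};

// Regular form: the second comparison is skipped when the first one fails.
bool equalShortCircuit(const Key &a, const Key &b)
{
    return a.first == b.first && a.second == b.second;
}

// Coverage form: both comparisons are always evaluated and their 0/1 results
// are multiplied, so there is no branch between them.
bool equalBranchless(const Key &a, const Key &b)
{
    const auto firstEqual = static_cast<std::uintptr_t>(a.first == b.first);
    const auto secondEqual = static_cast<std::uintptr_t>(a.second == b.second);

    return firstEqual * secondEqual != 0;
}
```

Both functions return the same result for every combination of equal and unequal fields; they differ only in whether the second comparison can be skipped.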

Contributor

I am super unkeen to see this type of switching happening. We have avoided this as it simply makes a mockery of claiming that we have 100% line coverage. Have you had a look at the generated machine code to actually see whether the second version is less efficient (has more machine code)? Quite often, the compiler generates the same code when optimisation is turned on.

Contributor

@hsorby hsorby Apr 22, 2026

As far as my analysis goes we can replace && with &:

// Bitwise test on booleans.
(first == other.first) & (second == other.second)

This will not create a branch in the debug build.

The optimised builds of either way will produce very similar machine code.
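The difference in evaluation can be made visible with operands that have side effects. The sketch below uses a hypothetical `Probe` counter in place of the real comparisons; with plain comparisons such as `first == other.first`, there is no side effect to observe, which is why an optimising compiler is free to emit the same code for both forms.

```cpp
#include <cassert>

// Hypothetical helper that counts how many times its operand was evaluated.
struct Probe
{
    int calls = 0;

    bool check(bool result)
    {
        ++calls;

        return result;
    }
};

// Logical AND: the right-hand side is not evaluated when the left is false.
bool logicalAnd(Probe &lhs, Probe &rhs, bool a, bool b)
{
    return lhs.check(a) && rhs.check(b);
}

// Bitwise AND on booleans: both sides are always evaluated. Some compilers
// may warn about using '&' with boolean operands here.
bool bitwiseAnd(Probe &lhs, Probe &rhs, bool a, bool b)
{
    return lhs.check(a) & rhs.check(b);
}
```

Calling `logicalAnd()` with a false left-hand side leaves the right-hand counter at 0, while `bitwiseAnd()` increments both counters, which is the behaviour that removes the uncovered branch.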

Contributor Author

I have just checked with https://godbolt.org/ and the generated machine code is different for & and && in debug mode BUT the same in release mode. So, thanks for that, will implement your suggestion.

Contributor

It seems like a significant change that we shouldn’t bury under other changes.

Contributor

For sure! We should have a single clear PR that makes a change like that, isolated from any other code changes.

Contributor

And do we want that done before this goes through? So that this PR can take advantage of C++20?

Contributor

I wouldn't think so. You'd want to update all similar comparisons, right? This particular issue has already been resolved, right? So it's not blocking this PR being merged.

Contributor Author

@agarny agarny Apr 22, 2026

Happy to leave this PR as-is and have C++20 added in another PR that will also remove those operator==() methods.

Contributor

@nickerso nickerso left a comment


I think this looks ok, just trying to understand the introduction and use of the new code coverage define.

nickerso previously approved these changes Apr 14, 2026
agarny added 2 commits April 22, 2026 14:33
In debug mode (what we use for our coverage tests), `A & B` will get both `A` and `B` to be evaluated even if `A` is `false`. This means that we get 100% branch coverage.

In release mode, the optimiser generates the same machine code for `A & B` as for `A && B`, so `B` is evaluated only if `A` is `true`, which is the kind of optimisation we are after.
@agarny agarny requested review from hsorby and nickerso April 22, 2026 03:39
Contributor

hsorby commented Apr 24, 2026

For me this looks like a good solution but adding a big file that is not used doesn’t seem to me to be the way to go.
What I read into this is that we want this test but it takes too long to run in a practical development cycle and it is not needed to cover lines of code that are not already covered. I also see that we are showing a need to have some real world examples or tests that are not run as part of the development cycle but perhaps nightly or weekly on the current main branch. We could also add the documentation and tutorial tests into this scheme.

So for me this PR needs to get rid of the big file and the unused/commented out tests (from history as well) and just keep the cache changes.

I will add some other issues to capture these test change ideas, where people can add their views in a more appropriate place than this PR comment chain.

Contributor

hsorby commented Apr 24, 2026

The real-world example issue is available here: #1380. Comments on that idea should be posted there.

Contributor Author

agarny commented Apr 24, 2026

> For me this looks like a good solution but adding a big file that is not used doesn’t seem to me to be the way to go. What I read into this is that we want this test but it takes too long to run in a practical development cycle and it is not needed to cover lines of code that are not already covered. I also see that we are showing a need to have some real world examples or tests that are not run as part of the development cycle but perhaps nightly or weekly on the current main branch. We could also add the documentation and tutorial tests into this scheme.

100% agreed.

> So for me this PR needs to get rid of the big file and the unused/commented out tests (from history as well) and just keep the cache changes.

Agreed, am going to remove them and force push things.

Contributor Author

agarny commented Apr 24, 2026

Closing this PR in favour of PR #1384. I couldn't "easily" remove the very big CellML file from the Git history. So, faster and easier to create a new PR.

@agarny agarny closed this Apr 24, 2026
@agarny agarny deleted the fixes branch April 24, 2026 02:58
