Improve the overall speed and memory use of libCellML#1386
Open
agarny wants to merge 3 commits intocellml:mainfrom
Open
Improve the overall speed and memory use of libCellML#1386agarny wants to merge 3 commits intocellml:mainfrom
agarny wants to merge 3 commits intocellml:mainfrom
Conversation
Only support the three most recent versions of macOS, not to mention that we want to be able to use `std::format()` which is only available on macOS 13 and later.
There was a problem hiding this comment.
Pull request overview
This PR focuses on improving libCellML performance and memory usage by removing some regex usage, reducing copying/allocations, adding caches/sets for faster lookups, and switching several internal containers to unordered variants. It also updates tests to cover new XML/Printer and Generator edge cases.
Changes:
- Replace regex-heavy logic with simpler string/character operations and pre-reserve strings/vectors to reduce allocations.
- Introduce caching and set/unordered_set-based membership checks to reduce repeated linear scans during analysis/validation/printing.
- Change a large portion of the public C++ API to return
const std::string&/const std::vector<...>&, and update build settings (C++20 + macOS deployment target).
Reviewed changes
Copilot reviewed 55 out of 56 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/printer/printer.cpp | Adds Printer tests for XML processing-instruction/malformed XML in Math. |
| tests/coverage/coverage.cpp | Adds coverage tests for generator version strings and Printer math edge case. |
| src/xmlnode.h | Replaces hasAttribute() with rawName() API. |
| src/xmlnode.cpp | Optimises element checks; adds rawName(); streamlines attribute access. |
| src/xmldoc.cpp | Removes regex in error callback; uses std::replace for newline cleanup. |
| src/xmlattribute.cpp | Avoids string allocations for namespace comparisons; adds nullptr namespace handling. |
| src/variable.cpp | Reduces lambda captures; removes unnecessary captures; returns string refs for some getters. |
| src/validator.cpp | Replaces repeated linear searches with (un)ordered sets; uses std::format; reduces copies. |
| src/utilities.h | Changes analyserVariables() to return by reference. |
| src/utilities.cpp | Adds reserve() calls; speeds equivalent-variable traversal with a seen set; uses std::format. |
| src/units.cpp | Avoids temporary copies when reading/updating unit definitions. |
| src/reset.cpp | Changes getters to return const std::string& to avoid copies. |
| src/printer.cpp | Removes regex from math printing and connection de-duplication; reduces allocations; adds sets. |
| src/parser.cpp | Reduces copies in namespace validation; adds string reserve for issue text. |
| src/namedentity.cpp | Changes name() to return const std::string&. |
| src/model.cpp | Uses unordered_set to de-dupe import requirements; reduces lambda captures. |
| src/issue.cpp | Changes description() to return const std::string&. |
| src/internaltypes.h | Switches several internal maps to unordered variants (incl. ConnectionMap). |
| src/importsource.cpp | Changes url() to return const std::string&. |
| src/importer.cpp | Speeds library lookups with find/emplace; uses unordered_set for name checks. |
| src/importedentity.cpp | Changes importReference() to return const std::string&. |
| src/generatorvariabletracker.cpp | Avoids intermediate vector copies; reserves result capacity. |
| src/generatorprofiletools.cpp | Reserves profile string buffer. |
| src/generatorprofile.cpp | Changes many getters to return const std::string&. |
| src/generator_p.h | Updates Generator impl APIs and switches dependency tracking to unordered sets. |
| src/entity.cpp | Changes id() to return const std::string&. |
| src/debug.cpp | Reserves output buffer for AST printing. |
| src/componententity.cpp | Reduces lambda captures; changes encapsulationId getter to string ref. |
| src/component.cpp | Reduces lambda captures; changes math getter to string ref. |
| src/commonutils.cpp | Simplifies owningModel traversal using parent chain. |
| src/api/libcellml/variable.h | Public API: returns string refs for initialValue/interfaceType. |
| src/api/libcellml/reset.h | Public API: returns string refs for reset/test values and IDs. |
| src/api/libcellml/namedentity.h | Public API: returns string ref for name(). |
| src/api/libcellml/issue.h | Public API: returns string ref for description(). |
| src/api/libcellml/importsource.h | Public API: returns string ref for url(). |
| src/api/libcellml/importedentity.h | Public API: returns string ref for importReference(). |
| src/api/libcellml/generatorprofile.h | Public API: returns string refs for many profile strings. |
| src/api/libcellml/entity.h | Public API: returns string ref for id(). |
| src/api/libcellml/componententity.h | Public API: returns string ref for encapsulationId(). |
| src/api/libcellml/component.h | Public API: returns string ref for math(). |
| src/api/libcellml/analysermodel.h | Public API: returns vector refs for model collections. |
| src/api/libcellml/analyserexternalvariable.h | Public API: returns vector ref for dependencies. |
| src/api/libcellml/analyserequationast.h | Public API: returns string ref for AST value. |
| src/api/libcellml/analyserequation.h | Public API: returns vector refs for equation collections. |
| src/annotator.cpp | Uses std::format; reduces repeated lookups; reserves buffers for hashing. |
| src/analyservariable.cpp | Reserves vector capacity when materialising weakptr lists. |
| src/analysermodel_p.h | Adds analyser-variable and equivalence caching structures. |
| src/analysermodel.cpp | Returns vectors by ref; adds analyser-variable cache; updates equivalence cache keying. |
| src/analyserexternalvariable.cpp | Reduces lambda captures; returns dependencies by ref. |
| src/analyserequationast.cpp | Returns AST value by ref. |
| src/analyserequation.cpp | Reserves capacities; returns various vectors by ref. |
| src/analyser_p.h | Adds caches and sets for membership; switches maps to unordered variants. |
| src/analyser.cpp | Removes repeated linear scans via sets/caches; uses raw XML names for faster MathML dispatch. |
| CMakeLists.txt | Sets C++20 and raises macOS deployment target. |
| .github/workflows/deploy-on-release.yml | Updates macOS deployment target used for release builds. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1368.