Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add caching for and/or and bring ItemsetBinding caching back #742

Merged
merged 5 commits into from
Jan 30, 2024

Conversation

seadowg
Copy link
Member

@seadowg seadowg commented Jan 24, 2024

This also adds the old caching for ItemsetBinding back in. This will mean that the last choice calculation for a question will always be cached for an O(1) lookup. This cache is evaluated when the non static parts of the nodes expression change.

What has been done to verify that this works as intended?

New tests and verified performance improvement for forms with and in Collect (with ./gradlew installLocal.

Why is this the best possible solution? Were any other approaches considered?

As discussed with @lognaturel, long term we'd like to reduce how often Collect calls getSelectChoices() to reduce the need for the caching code in ItemsetBinding. That aside, this level of caching will probably always be advantageous for cases where a data collector is navigating back and forwards to a question that other caching isn't covering.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

The main risk here is that the caching breaks some select choice calculations. It'd be good to try and think of ways this could happen.

@seadowg seadowg marked this pull request as ready for review January 26, 2024 10:44
@seadowg seadowg added this to the v4.3.0 milestone Jan 26, 2024
@lognaturel
Copy link
Member

Here are a few things I believe to be true:

  • EqualityExpressionIndexFilterStrategy means that getting choices for equality expressions is O(1) since Use FilterStrategy optimizations for select choice filters #722. Adding ItemsetBinding caching back means the latest filtered list for each select that uses an equality expression filter is cached twice. Because of the way we build up the cached list in ItemsetBinding, there's no opportunity to share references.
  • without the ItemsetBinding caching, ComparisonExpressionCacheFilterStrategy helps with the multiple getSelectedChoices calls that are immediately triggered by a first call on getSelectedChoices but not, for example, rendering the same select question multiple times. (This is the statement I'm least sure about. @dbemke's testing on on Jan 24 suggests it but it's not totally clear to me how it works)
  • the ItemsetBinding caching sits before the filter strategies so it will always get hits before the filter strategy caches. In other words, the additions to ComparisonExpressionCacheFilterStrategy won't make a difference for selects because that cache only exists for the duration of an evaluation cascade and selects aren't in those cascades.
  • the additions to ComparisonExpressionCacheFilterStrategy will improve performance for boolean expressions with and/or if they are repeated in the evaluation cascade.
  • because ComparisonExpressionCacheFilterStrategy doesn't do anything recursive with XPathBoolExpr, if there's more than one boolean operator, evaluation will always fall back to the raw/linear search approach

Does my understanding seem right? Given all that, it feels like adding and/or and bringing back ItemsetBinding are pretty separate concerns. It might be somewhat helpful for archaeology to have a separate merge/PR bringing back the ItemsetBinding caching. I don't feel very strongly about it, so I'll leave it up to you @seadowg

@seadowg
Copy link
Member Author

seadowg commented Jan 30, 2024

Adding ItemsetBinding caching back means the latest filtered list for each select that uses an equality expression filter is cached twice. Because of the way we build up the cached list in ItemsetBinding, there's no opportunity to share references.

Agreed! Both should be lists of references to the same objects so the additional memory usage is hopefully very limited.

without the ItemsetBinding caching, ComparisonExpressionCacheFilterStrategy helps with the multiple getSelectedChoices calls that are immediately triggered by a first call on getSelectedChoices but not, for example, rendering the same select question multiple times.

Yes. ComparisonExpressionCacheFilterStrategy isn't going to get used very much for selects any longer, other than in cases where two selects share the same expression.

the ItemsetBinding caching sits before the filter strategies so it will always get hits before the filter strategy caches. In other words, the additions to ComparisonExpressionCacheFilterStrategy won't make a difference for selects because that cache only exists for the duration of an evaluation cascade and selects aren't in those cascades.

This isn't quite true as far as I understand? ComparisonExpressionCacheFilterStrategy uses a cache that exists between evaluation cascades. IdempotentExpressionCacheFilterStrategy is maybe what you're thinking of?

the additions to ComparisonExpressionCacheFilterStrategy will improve performance for boolean expressions with and/or if they are repeated in the evaluation cascade.

Yes should do!

because ComparisonExpressionCacheFilterStrategy doesn't do anything recursive with XPathBoolExpr, if there's more than one boolean operator, evaluation will always fall back to the raw/linear search approach

Yeah!

I'll let you look over answers here and then you can merge if it all seems ok!

@lognaturel
Copy link
Member

This isn't quite true as far as I understand? ComparisonExpressionCacheFilterStrategy uses a cache that exists between evaluation cascades. IdempotentExpressionCacheFilterStrategy is maybe what you're thinking of?

Ah yes, sorry. So I took another look at tests with repeats and at the structure of the cache key. I think it looks ok.

@lognaturel lognaturel changed the title Add caching for and/or of two comparison expressions Add caching for and/or and bring ItemsetBinding caching back Jan 30, 2024
@lognaturel lognaturel merged commit c4a82dd into getodk:master Jan 30, 2024
3 checks passed
@seadowg seadowg deleted the caching branch January 30, 2024 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants