New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize load times and memory footprints of form entry with datasets #5623
Comments
@lognaturel as described in the issue here, I've realized that our recent optimization work didn't touch external secondary instances when used as itemsets. My thinking is that most forms of this kind would include a select on the instance that's later used to calculate things in the form (the latter of which we're optimizing and planning to improve on here). I still think this work is worth doing, but it feels like we need to apply the "predicate filter chain" (or something similar) to itemsets in JavaRosa (and then expand on it in Collect) to really get the speed-ups and memory drops we want? |
First step to clearing "needs discussion" here should be proving/disproving this statement. |
Digging into the above question, it seems that we do have some level of caching applied to choice filters on selects. It is implemented in a completely different way though, the new caching only affects evaluations that go through Passing: Scenario scenario = Scenario.init("Some form", html(
head(
title("Some form"),
model(
mainInstance(t("data id=\"some-form\"",
t("choice"),
t("select")
)),
instance("instance",
item("a", "A"),
item("b", "B")
),
bind("/data/choice").type("string"),
bind("/data/select").type("string")
)
),
body(
input("/data/choice"),
select1Dynamic("/data/select", "instance('instance')/root/item[value = /data/choice]")
)
));
int evaluations = Measure.withMeasure(asList("PredicateEvaluation", "IndexEvaluation"), () -> {
scenario.answer("/data/choice", "a");
scenario.choicesOf("/data/select");
scenario.choicesOf("/data/select");
});
// Check that we do less than (size of secondary instance) * (number of choice lookups)
assertThat(evaluations, lessThan(4)); Failing: Scenario scenario = Scenario.init("Some form", html(
head(
title("Some form"),
model(
mainInstance(t("data id=\"some-form\"",
t("choice"),
t("select")
)),
instance("instance",
item("a", "A"),
item("b", "B")
),
bind("/data/choice").type("string"),
bind("/data/select").type("string")
)
),
body(
input("/data/choice"),
select1Dynamic("/data/select", "instance('instance')/root/item[value = /data/choice]")
)
));
int evaluations = Measure.withMeasure(asList("PredicateEvaluation", "IndexEvaluation"), () -> {
scenario.answer("/data/choice", "a");
scenario.choicesOf("/data/select");
scenario.answer("/data/choice", "b");
scenario.choicesOf("/data/select");
});
// Check that we do less than (size of secondary instance) * (number of choice lookups)
assertThat(evaluations, lessThan(4)); The first test passes because we cache choice list evaluations so that repeated look-ups (with the exact same literal expression) don't need to do anything. This doesn't use our predicate filter concept, so there's currently no way for a client (like Collect) to hook into the choice list evaluation like there is for all other expressions. This isn't the end of the world, but it would make our goal of reducing the memory footprint moot as we'd always load the full instance into memory to calculate choices whether we optimize how other expressions work or not. I think the next step is to spike on replacing the current caching implementation for choice lists with the same |
It does look like we can use our new Now that's worked out, I've run into another issue: if we want to reduce the memory footprint as detailed above (so that sessions after the first only use the memory needed to load the same form without the external instance), we'll run into trouble due to the choice list caching in Regardless of the complexity involved in reworking the existing |
I've pulled this out to a new issue that's not blocked by the above discussion. |
Blocked by #5694Optimize load times
When a form is loaded in Collect calculates are run as part of "initialization" (when
FormDef#initialize
is called). This means that the complexity of calculate expressions affects the for load time. Recent changes to JavaRosa have introduced lazy indexing and caching to some calculate expressions on secondary instances which lets them run faster after this initialization step, but all of these solutions are "in memory" so the speed-up doesn't persist between form entry sessions.We want to make changes to Collect that result in faster subsequent "loads" (probably really initialization) for forms that contain the following restricted set of calculate expressions on external secondary instances that compare nodes in the instance to something else of the following form:
The "relative" part must be a relative path expression, and the "absolute" part can be an absolute path expression, a string literal or a numeric literal.
To measure this, we should use timings on a known slow device (like @lognaturel's trusty test phone). We're aiming for speed-ups (for subsequent loads) of around the order of an O(n) to O(1) complexity here.
Decrease memory footprint
Currently, external secondary instances are loaded into memory with the rest of the form. This is potentially problematic for larger datasets (we often see them exceed hundreds of thousands of records). Along with speeding up load times, we want to reduce the memory footprint of form entry for forms like this (on subsequent loads). This again only needs to apply to forms with our above restricted set of calculate expressions.
To measure this, we should use Android Studio's profiler with any emulator and monitor the total memory footprint of a loaded form with a larger dataset (100,000+ records). We should measure the same form with and without the dataset so that we have a rough idea of the size of the dataset in memory. We are aiming to see the total memory footprint drop so that the dataset's footprint is virtually gone (in subsequent loads).
The text was updated successfully, but these errors were encountered: