CADIS needs to improve performance to meet demands such as 100 ms ticks.
This ticket is to improve performance for calculating queries in subsets.
Description:
Retrieving sets from the store was improved by maintaining a list of updates per sim. Every attached simulator has its own data structure that tracks which objects are new, updated, or deleted since its last pull.
With subsets, this is not possible. With every pull of a subset, the query needs to be re-executed, and the whole subset is sent back to the client. There are two issues with this approach:
- Executing the query takes time. In the committed example, EmptyBusiness is a subset of BusinessNode. An EmptyBusiness is any BusinessNode that has no Person object attached to it (through the foreign key 'EmployedBy'). To calculate this, we need to do:
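    def query(store):
        # Fetch both full collections from the store.
        bns = store.get(BusinessNode, False)
        res = []
        ppl = store.get(Person, False)
        for b in bns:
            occupied = False
            for p in ppl:
                # A Person points at its employer through EmployedBy.
                if p.EmployedBy == b.Name:
                    occupied = True
                    break  # one employee is enough, stop scanning
            if not occupied:
                res.append(b)
        return res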
This is an O(n^2) search, and it requires copying two large arrays of objects (BusinessNode and Person). In tests with about 50 Person objects and about 50 BusinessNode objects, it took at least 200 ms to run.
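To make the complexity point concrete, here is a minimal sketch of the same query with the inner loop replaced by a set lookup. The BusinessNode, Person, and FakeStore classes are hypothetical stand-ins so the snippet runs on its own; only the store.get(cls, False) call shape is taken from the example above. This addresses the nested loop only. The cost of copying both collections out of the store is unchanged.

```python
class BusinessNode:
    """Hypothetical stand-in for the real BusinessNode type."""
    def __init__(self, name):
        self.Name = name


class Person:
    """Hypothetical stand-in for the real Person type."""
    def __init__(self, employed_by):
        self.EmployedBy = employed_by


class FakeStore:
    """Hypothetical in-memory store; not the CADIS store API."""
    def __init__(self, objects):
        self._objects = list(objects)

    def get(self, cls, flag):
        # 'flag' mirrors the second argument used in the ticket;
        # its meaning is not assumed here and it is ignored.
        return [o for o in self._objects if type(o) is cls]


def query_with_set(store):
    bns = store.get(BusinessNode, False)
    ppl = store.get(Person, False)
    # One pass over Person builds the set of employers that are in use.
    employers = {p.EmployedBy for p in ppl}
    # One pass over BusinessNode keeps the nodes nobody references.
    return [b for b in bns if b.Name not in employers]
```

With n BusinessNode and m Person objects this does O(n + m) work instead of O(n * m), assuming the EmployedBy values are hashable.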
How can we improve this performance? Some suggestions:
- Improve how we copy/construct objects. This would also help performance with store.get of thousands of objects, which also takes seconds to run.
- Cache results and send deltas of the subset
- Assuming subset queries always return a collection of the same object type, send only IDs on the pull. This would save time sending a lot of data (particularly over a network), but would still leave the first issue: thousands of objects must be copied in memory at every tick.
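The second and third suggestions could be combined along these lines. This is a sketch under assumptions, not the existing store code: SubsetDeltaCache, its pull method, and the sim_id key are all hypothetical names, and the query is assumed to have already produced the IDs of the current subset members.

```python
class SubsetDeltaCache:
    """Hypothetical helper: remembers which IDs each simulator last saw."""

    def __init__(self):
        self._last_sent = {}  # sim_id -> set of IDs sent on the previous pull

    def pull(self, sim_id, current_ids):
        current = set(current_ids)
        previous = self._last_sent.get(sim_id, set())
        self._last_sent[sim_id] = current
        # Only membership changes are reported, not the whole subset.
        return {
            "added": sorted(current - previous),
            "removed": sorted(previous - current),
        }
```

Usage, with made-up IDs:

```python
cache = SubsetDeltaCache()
cache.pull("sim1", ["a", "b"])   # first pull: everything is "added"
cache.pull("sim1", ["b", "c"])   # later pull: only "c" added, "a" removed
```

Note that this tracks membership only. An object that stays in the subset but has changed fields would need to be reported separately, and the query itself still has to run to produce current_ids.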
- The simulator might be interested in this set only once or twice. Thus we are constantly recalculating a subset, which takes time, without really using it. To address this problem, patch cc7bd7b added a disabled flag that lets the simulator disable subset pull, so the store can avoid recalculating the query every tick. However, this is an inelegant solution. Can we do better?
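One possible direction, offered as a sketch rather than a proposal for the actual design: compute the subset only when a simulator pulls it, and reuse the cached result until something the query depends on has changed. LazySubset and invalidate are hypothetical names, and the sketch assumes the store can call invalidate() whenever an object of a type the query reads is added, updated, or deleted.

```python
class LazySubset:
    """Hypothetical wrapper that runs a subset query only on demand."""

    def __init__(self, query):
        self._query = query   # callable taking the store, returning a list
        self._cached = []
        self._dirty = True    # nothing computed yet

    def invalidate(self):
        # Assumed hook: called by the store when relevant objects change.
        self._dirty = True

    def pull(self, store):
        # Re-run the query only if the underlying data changed
        # since the last pull; otherwise return the cached result.
        if self._dirty:
            self._cached = self._query(store)
            self._dirty = False
        return self._cached
```

A simulator that never pulls the subset never triggers the query, so no explicit disabled flag is needed. The trade-off is that the first pull after a change pays the full query cost inside that tick.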