Significant inefficiencies in stitchIntoRecord #12

far-blue · 2019-08-21T22:14:18Z

Hi all,

I'm testing pushing real-world levels of data through a microservice I've written using Atlas and I've noticed some significant inefficiencies in the various stitchIntoRecord methods.

For instance, the OneToOne class method in combination with RegularRelationship's stitchIntoMethods() method basically loops through every nativeRecord in turn and then loops through every foreignRecord to match them up. However, it neither exits early when a match is found nor removes matched foreignRecords. So with 1000 nativeRecords and their associated 1000 foreignRecords you end up with 1,000,000 calls to recordsMatch().
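
For illustration, here is a minimal sketch of the pattern described above. It is written in Python rather than Atlas's PHP, and it is not the library's code: the dict-based records, the `native_id` key, and the function names are assumptions made up for this example. It only counts how often the match check runs.

```python
# Illustrative only: records are plain dicts, not Atlas Record objects.
def records_match(native, foreign, calls):
    calls[0] += 1  # count every comparison
    return native["id"] == foreign["native_id"]

def stitch_one_to_one_naive(native, foreign_records, calls):
    native["related"] = None
    for foreign in foreign_records:
        if records_match(native, foreign, calls):
            # No break and no removal: scanning continues, so the last match wins.
            native["related"] = foreign

def stitch_all_naive(native_records, foreign_records):
    calls = [0]
    for native in native_records:
        stitch_one_to_one_naive(native, foreign_records, calls)
    return calls[0]

natives = [{"id": i} for i in range(1000)]
foreigns = [{"native_id": i} for i in range(1000)]
naive_calls = stitch_all_naive(natives, foreigns)
print(naive_calls)  # 1000000
```

Every native record scans the full foreign list, which is where the 1000 x 1000 figure comes from.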

I'd like to suggest that:

- The stitchIntoRecord abstract method is adjusted to accept the foreignRecords array by reference
- The OneToOne method both removes a matched foreignRecord and breaks the loop early
- The ManyToOne method breaks early but doesn't remove the foreignRecord from the array
- The OneToMany method removes the matched foreignRecord from the array but doesn't break the loop early
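
The three suggestions above can be sketched roughly as follows. This is a Python illustration under assumptions, not the Atlas implementation: the dict records, the `native_id` and `foreign_id` keys, and the function names are hypothetical, and mutating the caller's list stands in for PHP's pass-by-reference.

```python
def stitch_one_to_one(native, foreign_records):
    """Assign the first match, remove it from the shared list, stop scanning."""
    native["related"] = None
    for index, foreign in enumerate(foreign_records):
        if native["id"] == foreign["native_id"]:
            native["related"] = foreign
            del foreign_records[index]  # caller's list shrinks
            break

def stitch_many_to_one(native, foreign_records):
    """Stop at the first match, but keep it for other natives that share it."""
    native["related"] = None
    for foreign in foreign_records:
        if native["foreign_id"] == foreign["id"]:
            native["related"] = foreign
            break

def stitch_one_to_many(native, foreign_records):
    """Collect every match and remove each one; the full scan is still needed."""
    matched, remaining = [], []
    for foreign in foreign_records:
        if foreign["native_id"] == native["id"]:
            matched.append(foreign)
        else:
            remaining.append(foreign)
    native["related"] = matched
    foreign_records[:] = remaining  # in-place update, visible to the caller

# One-to-one: each profile is consumed once.
authors = [{"id": 1}, {"id": 2}]
profiles = [{"native_id": 1}, {"native_id": 2}]
for author in authors:
    stitch_one_to_one(author, profiles)

# Many-to-one: both posts share the same user, so it must stay in the list.
posts = [{"foreign_id": 7}, {"foreign_id": 7}]
users = [{"id": 7}]
for post in posts:
    stitch_many_to_one(post, users)

# One-to-many: each thread takes all of its replies out of the list.
threads = [{"id": 1}, {"id": 2}]
replies = [{"native_id": 1}, {"native_id": 2}, {"native_id": 1}]
for thread in threads:
    stitch_one_to_many(thread, replies)
```

After these loops `profiles` and `replies` are empty, while `users` still holds its single record.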

I don't think there are further optimisations available for the other relationship types.

In the case of OneToOne the result is a change from O(n^2) to O(2n). In testing with 3000 records in my app this changed execution time from over 2 mins to 5 seconds.
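
A small counting sketch may help put a number on this. It is a Python illustration with a hypothetical helper, `count_comparisons`, that models records as bare ids. As far as I can tell, the near-linear figure applies when the foreign records arrive in roughly the same order as the native ones; in the opposite order the early break still roughly halves the work but the growth stays quadratic.

```python
def count_comparisons(native_ids, foreign_ids):
    """Count match checks for one-to-one stitching with early break and removal."""
    remaining = list(foreign_ids)
    comparisons = 0
    for native_id in native_ids:
        for index, foreign_id in enumerate(remaining):
            comparisons += 1
            if native_id == foreign_id:
                del remaining[index]  # consumed, never compared again
                break
    return comparisons

n = 1000
same_order = count_comparisons(range(n), range(n))
reverse_order = count_comparisons(range(n), reversed(range(n)))
print(same_order)     # 1000
print(reverse_order)  # 500500
```

Both figures are well below the 1,000,000 comparisons of the unoptimised loop for the same input size.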

Do these changes sound correct and reasonable or have I missed something? If there's no reason against the changes I'm happy to put together a PR.

One point to note is that these changes alter the behaviour when there are duplicate records. If there are duplicate foreignRecord entries, currently the last matching record in the foreignRecords array is always assigned; with my changes the first will be assigned instead for OneToOne and ManyToOne. If there are duplicate nativeRecord entries, currently all duplicates receive the same foreignRecord; with my changes, for OneToOne and OneToMany, they will receive different ones, or possibly no match at all, because foreignRecord entries are removed as they are consumed.
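
To make the duplicate cases concrete, here is a short Python sketch of the one-to-one behaviour before and after. The `tag` field and all function names are hypothetical, added only so the assigned record is easy to see; this is not Atlas code.

```python
def assign_last_match(natives, foreigns):
    """Current behaviour: scan everything, so the last match wins."""
    for native in natives:
        native["related"] = None
        for foreign in foreigns:
            if native["id"] == foreign["native_id"]:
                native["related"] = foreign["tag"]

def assign_first_match_and_consume(natives, foreigns):
    """Proposed behaviour: take the first match, remove it, stop scanning."""
    pool = list(foreigns)  # copy so the example inputs can be reused
    for native in natives:
        native["related"] = None
        for index, foreign in enumerate(pool):
            if native["id"] == foreign["native_id"]:
                native["related"] = foreign["tag"]
                del pool[index]
                break

def related(assign, native_ids, foreigns):
    natives = [{"id": i} for i in native_ids]
    assign(natives, foreigns)
    return [native["related"] for native in natives]

dup_foreign = [{"native_id": 1, "tag": "first"}, {"native_id": 1, "tag": "second"}]
single_foreign = [{"native_id": 1, "tag": "only"}]

# Duplicate foreign records: last match before, first match after.
before_dup_foreign = related(assign_last_match, [1], dup_foreign)
after_dup_foreign = related(assign_first_match_and_consume, [1], dup_foreign)

# Duplicate native records: shared before, second one left unmatched after.
before_dup_native = related(assign_last_match, [1, 1], single_foreign)
after_dup_native = related(assign_first_match_and_consume, [1, 1], single_foreign)

print(before_dup_foreign, after_dup_foreign)  # ['second'] ['first']
print(before_dup_native, after_dup_native)    # ['only', 'only'] ['only', None]
```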

Thoughts from anyone?

pmjones · 2019-08-21T22:16:23Z

> In the case of OneToOne the result is a change from O(n^2) to O(2n). In testing with 3000 records in my app this changed execution time from over 2 mins to 5 seconds.

You had me at O-notation.

> If there's no reason against the changes I'm happy to put together a PR.

Please do!

froschdesign · 2019-08-22T08:22:08Z

My suggestion is to use benchmarking and define some scenarios for this case. PhpBench can help here.

far-blue · 2019-08-22T08:28:38Z

The most significant improvements from my suggested changes are with one-to-one relationships, where my own testing has shown massive improvements. That said, I think a more general approach, profiling common uses of Atlas for bottlenecks, would be great :)

pmjones · 2019-09-01T20:27:25Z

Fixed by #13

far-blue mentioned this issue Aug 22, 2019

Performance optimisations on stitchIntoRecord #13

Merged

pmjones closed this as completed Sep 1, 2019
