Speed up record inclusion check. #6289

cristianoc · 2023-06-06T10:44:45Z

Speed up record inclusion check.
Fixes #6284

Record inclusion check (between implementation and interface) is quadratic.

Example:

module M : {
  type t<'a, 'b, 'c> = {x:int, y:list<('a, 'b)>, z:int}
} = {
  type t<'a, 'b, 'c> = {x:int, y:list<('a, 'c)>, z:int}
}

The existing algorithm tries to instantiate type parameters. It only reports an error if there is an inconsistency. This requires solving type equations involving many types at once.

To give an accurate error message, the first problematic field is reported (in this case y). To find it, the type equations are re-checked on prefixes of the field list of size 1, 2, ..., n, where n is the number of fields (plus the type parameters each time).

This is quadratic and is problematic for records of ~1K fields.
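As a rough cost estimate, assume each type equation takes about constant time to check, and write p for the number of type parameters (both the unit cost and the symbol p are assumptions made for illustration, not measurements from the compiler). Re-checking every prefix then costs:

$$
\sum_{k=1}^{n} (p + k) = np + \frac{n(n+1)}{2}
$$

For n = 1000 fields and p = 3 type parameters, this gives 3000 + 500500 = 503500 equation checks, whereas a single pass over all fields needs only p + n = 1003.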

This PR provides a fast path which just checks if there is an error, without blaming a specific field. The fast path is linear.
Only if an error is detected is the quadratic path taken, to determine precisely which field is to blame.
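The fast-path-then-blame strategy can be sketched on a toy model. This is not the compiler's code: the `field` type, `consistent`, `equations`, `take`, `blame` and `check` are all hypothetical names, and type checking is reduced to asking whether pairs of type variable names form a one-to-one renaming.

```ocaml
(* Toy model (an assumption, not the compiler's representation): a field is
   its name plus the pairs of type variables found at matching positions in
   the interface and implementation declarations. *)
type field = { name : string; vars : (string * string) list }

(* One-pass consistency check: the pairs must form a one-to-one renaming. *)
let consistent (pairs : (string * string) list) : bool =
  let rec go subst = function
    | [] -> true
    | (l, r) :: rest -> (
        match List.assoc_opt l subst with
        | Some r' -> r = r' && go subst rest
        | None ->
            (not (List.exists (fun (_, r') -> r' = r) subst))
            && go ((l, r) :: subst) rest)
  in
  go [] pairs

(* All equations for the type parameters plus the given fields. *)
let equations params fields =
  params @ List.concat (List.map (fun f -> f.vars) fields)

(* First [k] elements of a list. *)
let rec take k = function
  | x :: rest when k > 0 -> x :: take (k - 1) rest
  | _ -> []

(* Slow path: re-check growing prefixes until one fails (quadratic). *)
let blame params fields =
  let n = List.length fields in
  let rec go k =
    if k > n then None
    else if consistent (equations params (take k fields)) then go (k + 1)
    else Some ((List.nth fields (k - 1)).name)
  in
  go 1

(* Fast path first; only a failing record pays for the slow path. *)
let check params fields =
  if consistent (equations params fields) then Ok ()
  else
    match blame params fields with
    | Some name -> Error name
    | None -> Error "<type parameters>" (* only reachable with no fields *)

(* The example from this description: 'b in the interface vs 'c in field y. *)
let params = [ ("a", "a"); ("b", "b"); ("c", "c") ]

let bad =
  [ { name = "x"; vars = [] };
    { name = "y"; vars = [ ("a", "a"); ("b", "c") ] };
    { name = "z"; vars = [] } ]

let good =
  [ { name = "x"; vars = [] };
    { name = "y"; vars = [ ("a", "a"); ("b", "b") ] };
    { name = "z"; vars = [] } ]
```

In this sketch `check params good` succeeds after a single pass, while `check params bad` falls through to `blame` and reports field `y`.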

mununki

Very clever!
This PR actually makes me wonder why upstream nests the calls to Ctype.equal. Obviously, for a record with many fields, it would be very slow.

cknitt · 2023-06-06T15:18:14Z

Great work! 🎉

There is another place where compare_records is invoked (in compare_constructor_arguments). Should that be adapted, too?

zth · 2023-06-06T20:20:23Z

@mununki could you check what https://github.com/mununki/benchmark-rescript-records gives when running this PR?

mununki · 2023-06-07T04:17:20Z

> @mununki could you check what mununki/benchmark-rescript-records gives when running this PR?

It's clear that performance has improved under the same conditions.

# before
Normal record x 7.29 ops/sec ±0.96% (23 runs sampled)
Record with spread x 5.30 ops/sec ±1.20% (18 runs sampled)
Record with optional x 5.08 ops/sec ±3.61% (17 runs sampled)
Record with spread and optional x 3.37 ops/sec ±0.96% (13 runs sampled)

# after
Normal record x 8.55 ops/sec ±0.83% (26 runs sampled)
Record with spread x 6.57 ops/sec ±2.79% (21 runs sampled)
Record with optional x 7.97 ops/sec ±1.18% (25 runs sampled)
Record with spread and optional x 6.40 ops/sec ±0.46% (20 runs sampled)

cristianoc · 2023-06-07T07:45:30Z

> Great work! 🎉
>
> There is another place where compare_records is invoked (in compare_constructor_arguments). Should that be adapted, too?

Refactored so there's a single function now.

cristianoc · 2023-06-07T07:49:13Z

One alternative would be to make this internal function incremental:

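(* Wraps the earlier definition of eqtype_list (this let is not recursive):
   reset univar_pairs, snapshot the unifier state, and always backtrack to
   the snapshot, whether the check succeeds or raises. *)
let eqtype_list rename type_pairs subst env tl1 tl2 =
  univar_pairs := [];
  let snap = Btype.snapshot () in
  try eqtype_list rename type_pairs subst env tl1 tl2; backtrack snap
  with exn -> backtrack snap; raise exn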

so it would not slow down even for the error case. However it would be more complicated.
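A minimal sketch of what "incremental" could mean, on a toy model rather than on Ctype itself: the state built while checking earlier fields is carried forward, so the first failing field is found in one pass over the fields. The `field` type and the functions `extend`, `extend_all` and `first_bad_field` are hypothetical, and a persistent association list stands in for the compiler's mutable state with snapshot and backtrack.

```ocaml
(* Toy model (an assumption): a field is its name plus the pairs of type
   variables at matching positions in the two declarations. *)
type field = { name : string; vars : (string * string) list }

(* Extend a one-to-one renaming with one more pair; None on conflict. *)
let extend subst (l, r) =
  match List.assoc_opt l subst with
  | Some r' -> if r = r' then Some subst else None
  | None ->
      if List.exists (fun (_, r') -> r' = r) subst then None
      else Some ((l, r) :: subst)

(* Extend with a list of pairs, stopping at the first conflict. *)
let extend_all subst pairs =
  List.fold_left
    (fun acc pair -> match acc with None -> None | Some s -> extend s pair)
    (Some subst) pairs

(* Single pass: keep the renaming built so far and report the first field
   whose equations conflict with it. *)
let first_bad_field params fields =
  match extend_all [] params with
  | None -> Some "<type parameters>"
  | Some subst ->
      let rec go subst = function
        | [] -> None
        | f :: rest -> (
            match extend_all subst f.vars with
            | None -> Some f.name
            | Some subst' -> go subst' rest)
      in
      go subst fields

(* Same example as in the PR description: 'b vs 'c in field y. *)
let params = [ ("a", "a"); ("b", "b"); ("c", "c") ]

let bad =
  [ { name = "x"; vars = [] };
    { name = "y"; vars = [ ("a", "a"); ("b", "c") ] };
    { name = "z"; vars = [] } ]
```

Here no prefix is ever re-checked, even when an error is found. The cost is that the checker must expose its intermediate state between fields, which is the extra complexity mentioned above.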

cristianoc requested review from mununki, cknitt and ryyppy June 6, 2023 10:44

mununki approved these changes Jun 6, 2023

cristianoc force-pushed the speed_up_record_inclusion_check branch from b335eef to cda4816 Compare June 7, 2023 07:24

cristianoc added 3 commits June 7, 2023 09:27

Add text for record inclusion.

4e458c0

So that there's some minimal sanity check that record type inclusion works as expected on a nontrivial case.

Update CHANGELOG.md

0701238

Refactor into a single function.

e8361ff

cristianoc merged commit a053b62 into master Jun 7, 2023
7 checks passed

cristianoc deleted the speed_up_record_inclusion_check branch June 7, 2023 07:51

fhammerschmidt mentioned this pull request Oct 2, 2023

Document v11 features. rescript-association/rescript-lang.org#697

