New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster dict implementation by improving code generation #879

Closed
Skinney opened this Issue Jun 24, 2017 · 5 comments

Comments

Projects
None yet
3 participants
@Skinney
Contributor

Skinney commented Jun 24, 2017

So I ran a simple benchmark, and got some interesting results. Here is the benchmark:

        describe "Dict"
            [ Benchmark.compare "Dict Get"
                (benchmark2 "First" Dict.get 1 dict)
                (benchmark2 "Last" Dict.get (n - 1) dict)
            ]

And here are the results:

bench_orig

Retrieving the last/highest/greatest element in a dictionary is slower than retrieving the first/lowest/least. This seemed odd, so I did some investigating. I'm not going to list up everything I tried, but those that made a difference.

Every change I did was made to the compiled javascript of Dict.get. It originally looks like this:

var _elm_lang$core$Dict$get = F2(
	function (targetKey, dict) {
		get:
		while (true) {
			var _p15 = dict;
			if (_p15.ctor === 'RBEmpty_elm_builtin') {
				return _elm_lang$core$Maybe$Nothing;
			} else {
				var _p16 = A2(_elm_lang$core$Basics$compare, targetKey, _p15._1);
				switch (_p16.ctor) {
					case 'LT':
						var _v20 = targetKey,
							_v21 = _p15._3;
						targetKey = _v20;
						dict = _v21;
						continue get;
					case 'EQ':
						return _elm_lang$core$Maybe$Just(_p15._2);
					default:
						var _v22 = targetKey,
							_v23 = _p15._4;
						targetKey = _v22;
						dict = _v23;
						continue get;
				}
			}
		}
	});

If one removes the continue get; statements of that code, as well as the get label above the while-statement, this is the result:

bench_no_label

That's a hefty performance improvement for the "first/lowest/least" case. To improve performance for "last/greatest/highest", one needs to change the order of the switch statements, so that the function now looks like this:

var _elm_lang$core$Dict$get = F2(
	function (targetKey, dict) {
		while (true) {
			var _p15 = dict;
			if (_p15.ctor === 'RBEmpty_elm_builtin') {
				return _elm_lang$core$Maybe$Nothing;
			} else {
				var _p16 = A2(_elm_lang$core$Basics$compare, targetKey, _p15._1);
				switch (_p16.ctor) {
					case 'LT':
						var _v20 = targetKey,
							_v21 = _p15._3;
						targetKey = _v20;
						dict = _v21;
					case 'GT':
						var _v22 = targetKey,
							_v23 = _p15._4;
						targetKey = _v22;
						dict = _v23;
					default:
						return _elm_lang$core$Maybe$Just(_p15._2);
				}
			}
		}
	});

Then you get the following performance:

bench_final

I also tried removing the label and continue-statements from other recursive functions like List.foldl and (my implementation of) Array.get, but saw no performance difference.

A couple of takeways:

  1. Removing labels and continue statements were they are not needed can significantly improve performance for complex recursive functions (were the compiler would generate more than two continue statements). Based on what I've read, JIT compilers have trouble optimizing such loops. Removing a conditional break-statement from the initializeFromList function in the new Array implementation confirms this, as performance increased 2-3 times. In other cases, it would simply reduce code size.
  2. Switch ordering really matters, which surprised me. In the case of Dict.get it matters a lot as the EQ case is only triggered once, if at all, while LT and GT cases are triggered "all the time".
  3. Loops not optimized properly, for instance because of complex loop-logic, can hinder other optimizations. Changing the switch ordering so GT comes right after LT, without removing labels and continue-statements, have no effect. Only when removing the label and continue statements is performance similar.

I don't know how difficult this is to implement, but it seems worthwhile :)

Also, it would be nice if someone could confirm my results. Here is a repo which makes it easy to get started: https://github.com/Skinney/elm-dict-exploration

@process-bot

This comment has been minimized.

Show comment
Hide comment
@process-bot

process-bot Jun 24, 2017

Thanks for the issue! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

process-bot commented Jun 24, 2017

Thanks for the issue! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

@eeue56

This comment has been minimized.

Show comment
Hide comment
@eeue56

eeue56 Jun 24, 2017

Contributor

Switch ordering really matters, which surprised me. In the case of Dict.get it matters a lot as the EQ case is only triggered once, if at all, while LT and GT cases are triggered "all the time".
Loops not optimized properly, for instance because of complex loop-logic, can hinder other optimizations. Changing the switch ordering so GT comes right after LT, without removing labels and continue-statements, have no effect. Only when removing the label and continue statements is performance similar.

This is easy to do, right now -- the last case in a case..of is always used as default. Simply changing the ordering in the Elm file will make that work: https://github.com/elm-lang/core/blob/master/src/Dict.elm#L107

Contributor

eeue56 commented Jun 24, 2017

Switch ordering really matters, which surprised me. In the case of Dict.get it matters a lot as the EQ case is only triggered once, if at all, while LT and GT cases are triggered "all the time".
Loops not optimized properly, for instance because of complex loop-logic, can hinder other optimizations. Changing the switch ordering so GT comes right after LT, without removing labels and continue-statements, have no effect. Only when removing the label and continue statements is performance similar.

This is easy to do, right now -- the last case in a case..of is always used as default. Simply changing the ordering in the Elm file will make that work: https://github.com/elm-lang/core/blob/master/src/Dict.elm#L107

@Skinney

This comment has been minimized.

Show comment
Hide comment
@Skinney

Skinney Jun 24, 2017

Contributor

I know. But as mentioned, it has no effect on performance without removing the label and continue-statements first. I think it might be because it prevents optimizing the switch statement or something :/

Contributor

Skinney commented Jun 24, 2017

I know. But as mentioned, it has no effect on performance without removing the label and continue-statements first. I think it might be because it prevents optimizing the switch statement or something :/

@eeue56

This comment has been minimized.

Show comment
Hide comment
@eeue56

eeue56 Jun 24, 2017

Contributor

We discussed on Slack and weren't able to find/think of a place where continue...label would be required, based on the current compiler. Even edgecases like a TCO function being defined inside the body of another TCO function do not need the continue. V8 seems to do no form of optimisation for continue/label jumps like this, as far as I can tell from the code. This might be relevant too.

Contributor

eeue56 commented Jun 24, 2017

We discussed on Slack and weren't able to find/think of a place where continue...label would be required, based on the current compiler. Even edgecases like a TCO function being defined inside the body of another TCO function do not need the continue. V8 seems to do no form of optimisation for continue/label jumps like this, as far as I can tell from the code. This might be relevant too.

@Skinney

This comment has been minimized.

Show comment
Hide comment
@Skinney

Skinney Jun 25, 2017

Contributor

I just realized that I posted this in the wrong repo, this was supposed to go to the elm-lang/compiler repo :(

I'll try to move it.

Contributor

Skinney commented Jun 25, 2017

I just realized that I posted this in the wrong repo, this was supposed to go to the elm-lang/compiler repo :(

I'll try to move it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment