Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add: optimise switches #78

Merged
merged 3 commits into from Oct 17, 2020
Merged

Add: optimise switches #78

merged 3 commits into from Oct 17, 2020

Conversation

@glx22
Copy link
Contributor

@glx22 glx22 commented Dec 21, 2019

I had the idea after seeing this pastebin from Andy.

With this PR the following code

switch (FEAT_TRAINS, SELF, open_car_pony_gen_6C_switch_loading_speed_by_cargo_0, cargo_classes & bitmask(CC_PASSENGERS, CC_MAIL, CC_ARMOURED)) {
	bitmask(CC_PASSENGERS): return 6;
	bitmask(CC_MAIL): return 6;
	bitmask(CC_ARMOURED): return 6;
	return 6;
}
switch (FEAT_TRAINS, SELF, open_car_pony_gen_6C_switch_loading_speed_by_cargo_1, cargo_classes & bitmask(CC_PASSENGERS, CC_MAIL, CC_ARMOURED)) {
	bitmask(CC_PASSENGERS): return 8;
	bitmask(CC_MAIL): return 8;
	bitmask(CC_ARMOURED): return 8;
	return 8;
}
switch (FEAT_TRAINS, SELF, open_car_pony_gen_6C_switch_loading_speed_by_cargo_2, cargo_classes & bitmask(CC_PASSENGERS, CC_MAIL, CC_ARMOURED)) {
	bitmask(CC_PASSENGERS): return 11;
	bitmask(CC_MAIL): return 11;
	bitmask(CC_ARMOURED): return 11;
	return 11;
}
switch (FEAT_TRAINS, SELF, open_car_pony_gen_6C_switch_loading_speed, param[0]) {
    0: open_car_pony_gen_6C_switch_loading_speed_by_cargo_0;
    1: open_car_pony_gen_6C_switch_loading_speed_by_cargo_1;
    2: open_car_pony_gen_6C_switch_loading_speed_by_cargo_2;
}

will be replaced at compile time with

switch (FEAT_TRAINS, SELF, open_car_pony_gen_6C_switch_loading_speed, param[0]) {
    0: return 6;
    1: return 8;
    2: return 11;
}

Each replacement generates a warning and the replacement is done only if the switch expression doesn't have side effects (no STORE_TEMP() nor STORE_PERM()) and if there's a default return.

@glx22 glx22 changed the title Add: optimise switch returning constants Add: optimise switches Dec 22, 2019
@glx22
Copy link
Contributor Author

@glx22 glx22 commented Dec 22, 2019

I made some changes. Now the check for all results being the same is done earlier, and if all ranges are the same as default they are removed. Then the replacement happens if there's only default, read only expression and not a return computed value

@glx22
Copy link
Contributor Author

@glx22 glx22 commented Dec 22, 2019

@andythenorth suggested more possible optimisations. I'm waiting for test cases.

nml/ast/switch.py Outdated Show resolved Hide resolved
nml/expression/array.py Outdated Show resolved Hide resolved
nml/expression/bitmask.py Outdated Show resolved Hide resolved
nml/expression/storage_op.py Outdated Show resolved Hide resolved
nml/expression/spritegroup_ref.py Show resolved Hide resolved
nml/ast/switch.py Show resolved Hide resolved
@glx22
Copy link
Contributor Author

@glx22 glx22 commented Dec 28, 2019

Stats compiling iron-horse

nml master:

Reading lang ... 0.2 s
Reading ... 0.1 s
Init parser ... 0.0 s
Parsing ... 106.7 s
Preprocessing ... 90.5 s
Generating actions ... 17.2 s
Assigning Action2 registers ... 0.4 s
Generating strings ... 0.1 s
Collecting real sprites ... 3.6 s
Checking palette of source images ... 6.1 s
Encoding ... 38.7 s
 nmlc info: 97102 sprites, 19664 cached, 312 orphaned, 64464 duplicates, 12974 newly encoded (native)
 nmlc info: Train items: 512/65420
 nmlc info: Concurrent spritesets: 2/255 (".\generated\iron-horse.nml", line 9747)
 nmlc info: Concurrent spritegroups: 176/256 (".\generated\iron-horse.nml", line 18445)
 nmlc info: Concurrent Action2 registers: 1/127 (".\generated\iron-horse.nml", line 188355)
 nmlc info: Concurrent ActionD registers: 4/64 (".\generated\iron-horse.nml", line 10438)
 nmlc info: GRF parameter registers: 3/64
 nmlc info: Cargo translation table: 93/254
 nmlc info: Railtype translation table: 6/256
 nmlc info: D0xx strings: 16/1024
Linking actions ... 1.3 s
Writing output ... 40.2 s

NFO size: 20269741
GRF size: 18943154

this PR:

Reading lang ... 0.2 s
Reading ... 0.1 s
Init parser ... 0.0 s
Parsing ... 119.7 s
Preprocessing ... 723.0 s
Generating actions ... 19.1 s
Assigning Action2 registers ... 0.4 s
Generating strings ... 0.2 s
Collecting real sprites ... 0.8 s
Checking palette of source images ... 6.7 s
Encoding ... 12.4 s
 nmlc info: 91985 sprites, 32638 cached, 312 orphaned, 59347 duplicates, 0 newly encoded (native)
 nmlc info: Train items: 512/65420
 nmlc info: Concurrent spritesets: 2/255 (".\generated\iron-horse.nml", line 9747)
 nmlc info: Concurrent spritegroups: 176/256 (".\generated\iron-horse.nml", line 18445)
 nmlc info: Concurrent Action2 registers: 1/127 (".\generated\iron-horse.nml", line 188355)
 nmlc info: Concurrent ActionD registers: 4/64 (".\generated\iron-horse.nml", line 10438)
 nmlc info: GRF parameter registers: 3/64
 nmlc info: Cargo translation table: 93/254
 nmlc info: Railtype translation table: 6/256
 nmlc info: D0xx strings: 16/1024
Linking actions ... 1.1 s
Writing output ... 35.2 s

NFO size: 18352185
GRF size: 17702514

So compilation is a lot slower, for a very small improvement.

Edit: and then I discovered -c nmlc option, with it GRF size is the same with both nmlc and grfcodec :)

@LordAro
Copy link
Member

@LordAro LordAro commented Dec 28, 2019

A whole order of magnitude is a ridiculous slow down. Is there something specific causing it?

@glx22
Copy link
Contributor Author

@glx22 glx22 commented Dec 28, 2019

Slow down is caused by repeating loops when searching for duplicates (one for each new template, one for each new spriteset)

@Yexo
Copy link
Contributor

@Yexo Yexo commented Jun 1, 2020

It looks like the first 3 commits can be a big win already, and it's only the last 2 commits that are causing the slowdown.
Is there anything preventing the merge of the first 3 commits, leaving the last 2 for later / behind a flag / after more optimization?

@glx22 glx22 force-pushed the reduce_refs branch 2 times, most recently from b9cdffa to 0b6b5a2 Jun 2, 2020
Yexo
Yexo previously approved these changes Jun 8, 2020
Copy link
Contributor

@Yexo Yexo left a comment

Non-blocking: this could use a test case.

nml/expression/spritegroup_ref.py Show resolved Hide resolved
nml/expression/storage_op.py Outdated Show resolved Hide resolved
Yexo
Yexo previously approved these changes Jun 9, 2020
@glx22 glx22 force-pushed the reduce_refs branch 2 times, most recently from 4c848fc to a456eb9 Oct 3, 2020
@glx22 glx22 merged commit aede745 into OpenTTD:master Oct 17, 2020
19 checks passed
@glx22 glx22 deleted the reduce_refs branch Oct 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants