Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization suggestion for _lookAreaMixedRegister() #41

Open
KageJittai opened this issue Jun 25, 2017 · 11 comments
Open

Optimization suggestion for _lookAreaMixedRegister() #41

KageJittai opened this issue Jun 25, 2017 · 11 comments

Comments

@KageJittai
Copy link

In _lookAreaMixedRegister at src/game/rooms.js#L681 the engine is using a .forEach call on the keys of an object.

You should be able to optimize this code quite a bit by converting it to a for loop. It is known that Array.forEach, especially when using closure, has far worst performance (between 50% - 90%) then a standard for loop.

https://jsperf.com/fast-array-foreach

@artch
Copy link
Contributor

artch commented Jun 25, 2017

keys is an array with 9 elements, I don't really think it makes any difference here.

@artch artch closed this as completed Jun 25, 2017
@KageJittai
Copy link
Author

KageJittai commented Jun 25, 2017

I don't believe it is:

var typeRegister = privateStore[id].lookTypeRegisters[type],
keys = typeRegister && Object.keys(typeRegister);

This seems to be the actual elements of typeRegister

And if you look at the actual way the key is used:

    keys.forEach((key) => {
        var obj = typeRegister[key];
        if(checkInside(obj)) {
            if(withType) {
                item = {type: type};
                item[type] = obj;
                if(asArray) {
                    result.push({x: obj.x || obj.pos.x, y: obj.y || obj.pos.y, type, [type]: obj});

Clearly typeRegister[key] is an actual game object that a .pos and everything.

@daboross
Copy link
Contributor

daboross commented Jun 25, 2017

I think here keys is the keys of lookTypeRegisters[type], not of lookTypeRegisters itself?

Still, iterating over each room object is going to be a bit less overhead than 1000 as in the perf test (and the other part is more than a simple addition). Here's a test which is somewhat more comparable to our workload (though still definitely not exact): https://jsperf.com/screeps-engine-benchmark-test.

In Chrome 59.0.3071 / Linux, I get 507,729 ops/secs with for-each and 1,093,702 with for-loop. It's still a micro-optimization, but might this change be worth it for this code which is called frequently in user scripts?

@artch
Copy link
Contributor

artch commented Jun 25, 2017

Ah, right, my mistake. Well, in order to be sure this is not a micro-optimization, it would be nice to see some concrete benchmarks on real use cases using Node.js 6.x and Node.js 8.x (as we have plans to switch soon).

@artch artch reopened this Jun 25, 2017
@KageJittai
Copy link
Author

Ok, I've written a script using the NPM benchmark package

I also had the microtime module installed (instructions in the link above).

These were the results on my machine:

CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
RAM: 32 GB of DDR4
Linux Kernel: 4.4.0-78-generic (64-bit)

kage@Kage-Linux-Mint-Desktop ~/Workspace/Screeps/benchmarking $ nodejs -v
v6.11.0
kage@Kage-Linux-Mint-Desktop ~/Workspace/Screeps/benchmarking $ nodejs test.js 
forEach (Current) x 104,430 ops/sec ±1.55% (270 runs sampled)
for x 112,462 ops/sec ±1.69% (270 runs sampled)
Fastest is for

The script can be found here

I should note, when I in-lined the function with the test, Array.forEach was faster, assuming because of a faster compile timings, might be related with static branching of asArray and withType. But when the functions were only invoked in the benchmark and defined outside of it, for was about 8% faster. This second configuration should be more representative of actual performance gains though this was worth noting.

I do not have Nodejs version 8, so if someone can run this script on 8 to ensure performance gains on newer versions are similar.

@KageJittai
Copy link
Author

KageJittai commented Jun 30, 2017

I've ran the test again on the new version of node:

v8.1.2
kage@Kage-Linux-Mint-Desktop ~/Workspace/Screeps/benchmarking $ nodejs test.js 
forEach (Current) x 224,947 ops/sec ±1.73% (276 runs sampled)
for x 271,963 ops/sec ±1.59% (283 runs sampled)
Fastest is for

Node8 appears significantly faster in both cases. And in this case using a normal for loop is 17% faster.

These tests are with 50 objects on the map, I've noticed lowering the object count actually leads to a greater performance difference between the two. For example same test on node8 with just 10 objects on the map:

forEach (Current) x 918,054 ops/sec ±1.31% (281 runs sampled)                                                                                                               
for x 1,202,287 ops/sec ±1.43% (285 runs sampled)                                                                                                                           
Fastest is for

In the case with less objects on the map, the for loop is 23.6% faster then the forEach.

@artch
Copy link
Contributor

artch commented Jun 30, 2017

Thank you for efforts on benchmarking this!

So, if I understand this correctly, 1 look call in a room with 50 objects takes something around 0.004ms. If a developed player with 100 visible rooms calls this 3 times per room every tick, it will take ~1 ms of his CPU on this tick. And this 17% optimization will provide a negligible benefit of 0.17 ms. Definitely looks to me like a micro-optimization that doesn't matter.

@tedivm
Copy link
Contributor

tedivm commented Jun 30, 2017

I use the look functions a lot. There are some ticks where I will use it potentially hundreds of times in a single room (this is vital for automated room planning).

@daboross
Copy link
Contributor

For reference I use lookFor at least 3 times per active creep, and up to around ~13 times for an active military creep. It's more like 90-120 times per owned room for me.

But since this is lookForArea / lookForAtArea, I use that much less often. Maybe 4-5 times per military attack creep, and in total around 3 per owned room on average each tick sounds accurate.

@KageJittai
Copy link
Author

KageJittai commented Jul 2, 2017

This is used by lookForAtArea and lookAtArea. I'd image that fairly often it's used per-active creep as this is this is used by virtually every creep to look at their surroundings. I should note, that lookAtArea actually calls this lower level function 10 times: src/game/rooms.js#L769

@KageJittai
Copy link
Author

KageJittai commented Jul 2, 2017

As an example how this might affect performance, imagine a builder creep who has the subroutine:

const room = creep.room;
const pos = creep.pos;
_.each(room.lookAtArea(pos.y - 3, pos.x - 3, pos.y + 3, pos.x + 3, true), (target) => {
    if (target.type === "constructionSite") {
        creep.build(target.constructionSite);
    }
    else if (target.type === "structure" && target.structure.hits < target.structure.hitsMax) {
         creep.repair(target.structure);
    }
});

This code will call the function being purposed to be optimized 10 times a tick per builder creep, and is completely reasonable to see players especially newer players make this type of code. Many of creeps use the lookForAtArea to look for things like resources it can sweep up while it walks, or military creeps looking for enemies in attack range.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants