Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement (up to 10x) of setDataSet #24

Closed
wants to merge 3 commits into from

Conversation

lukasberns
Copy link

setDataSet set was colorizing overlapping rectangles over and over when using a large set with setDataSet (e.g. 1000). I implemented a simplistic algorithm to reduce overlapping rectangles to their bounding box. On my Chrome, the execution time of setDataSet(1000) in the static_heatmap demo went down from 1000ms to about 60ms.

The algorithm could be further improved to avoid colorizing of whitespace.

I love heatmap.js :)

@matthewford
Copy link

Looks like a great addition, is it possible to rebase on the current master branch

@lukasberns
Copy link
Author

Is it ok to rebase what you have already pushed to github?
I know someone who already pulled from my repo.

Merge would do it too, no?

@drupol
Copy link

drupol commented Aug 14, 2012

I'm really willing to test this :)

@pa7
Copy link
Owner

pa7 commented Aug 15, 2012

@Polzme could you check whether this implementation performs better than the recent performance improvements in the master branch? (if yes, please use the file located in the src folder)

@drupol
Copy link

drupol commented Aug 16, 2012

Hello Patrick,

I've tested it and it works pretty good too.
I still need to find a way to compare the two methods, currently, I can't say which one is faster.
What's the best way to compare in speed two JS ? Do you have advice to give ?

Thanks!

@drupol
Copy link

drupol commented Aug 16, 2012

You can see it in action here: http://nid-de-poule.net/

@pa7
Copy link
Owner

pa7 commented Aug 16, 2012

Thanks for the fast response!
The master branch currently contains a folder called tests with the file 'performance-0.html'.
If you open this file once with the master branches src/heatmap.js and afterwards with the heatmap.js from the PR you can see how long it takes to render 1000 datapoints (there is an element containing the rendertime).
I will add bigger datasets the next week.
Please let me know about the results, thanks!

@lukasberns
Copy link
Author

To measure performance, please apply cb53a0c. My previous implementation was using setTimeout(), so the actual coloring was probably running after the performance timer had already stopped.

If I understand the improvements in 23af81c correctly, the duplicate colorizing issue is fixed. My implementation goes one step further and tries to avoid colorizing whitespace by merging intersecting rectangles first and then colorizing them. Of course this rectangle-merging algorithm costs time as well, so I assume the performance improvement depends on the data set.

The difference should be especially strong if you have a data set with multiple "islands". The current pa7 master branch would go over the whole bounding box, including the "water" between the islands, while with rectangle-merging only the bounding boxes of the islands are colorized.

For a data set with only one "island" the pa7 master branch should be slightly faster but it might not even be measurable at 1000 data points.

@drupol
Copy link

drupol commented Aug 16, 2012

Thanks @pa7, I'll do that in the afternoon !

@lukasberns, I've pulled your changes into http://nid-de-poule.net/ and it looks like it's even faster.

I can't be sure until i've run some more tests, but it looks better :)

@lukasberns
Copy link
Author

Sweet.

BTW @Polzme you seem to get a few data points at 0,0 (probably due to "undefined" indices). Please let me know if that appeared with my patch or if it was already there before.

@drupol
Copy link

drupol commented Aug 16, 2012

It's now using the @pa7 version, without your patch, you can also check if you want.

@lukasberns
Copy link
Author

Ok it's still there so I assume that's an issue on your end (data generation etc.)

@drupol
Copy link

drupol commented Aug 16, 2012

Using the @pa7 repository: http://www.webpagescreenshot.info/img/619463-816201290314PM
Using the @lukasberns repository: http://www.webpagescreenshot.info/img/938288-816201290459PM

Results are less or more the same, it's impossible to tell which one is faster according to these numbers.

@drupol
Copy link

drupol commented Aug 18, 2012

@lukasberns: Can you provide an example where we clearly see the difference ?

Thanks !

@lukasberns
Copy link
Author

Ok, here an extreme example: make a canvas with size 1000x800

Data A: max=1, points=[ { x: 1, y: 1 }, { x: 998, y: 798 } ]
Data B: max=1, points=[ { x: 499, y: 399 }, { x: 500, y: 400 } ]

Data A should have an extreme difference, while the difference in data B should be intelligible.

Oh and for your test with 207 ms vs 203 ms, try running the same test 100 times in a loop and measure the total time (about 20s). If you are using random data, randomize it for every run of the loop.

@lukasberns
Copy link
Author

Actually, for some reason my branch is still much faster.

// #heatmapArea has size 600x400
var xx = h337.create({"element":document.getElementById("heatmapArea"), "radius":25, "visible":true});

var caseA = { max:1, data:[] }; // two islands
var caseB = { max:1, data:[] }; // more or less uniform
for (var x = 0; x < 30; x++) {
    for (var y = 0; y < 20; y++) {
        caseA.data.push({ x:x, y:y, count:1 });
        caseA.data.push({ x:600-x, y:400-y, count:1 });

        caseB.data.push({ x:20*x, y:10*y, count:1 });
        caseB.data.push({ x:600-(20*x), y:400-(10*y), count:1 });
    }
}

var times = {};
var repeat = 100;

var start = +new Date;
for (var i = 0; i < repeat; i++)
    xx.store.setDataSet(caseA);
times.a = (+new Date - start)/repeat;

start = +new Date;
for (var i = 0; i < repeat; i++)
    xx.store.setDataSet(caseB);
times.b = (+new Date - start)/repeat;

alert('A: '+times.a+'ms, B: '+times.b+' ms');

@lukasberns repo A: 27.08 ms, B: 38.12 ms (3feea5d)
@pa7 repo A: 181.94 ms, B: 193.43 ms (05855c1)

Data A should have an extreme difference, while the difference in data B should be intelligible.

So yeah, as you can see contrary to my expectation, although B is slower on the @lukasberns branch, it's still faster than the @pa7. Don't really know why, it should be doing the same lol.

@muloka
Copy link

muloka commented Feb 20, 2013

👍

@pa7
Copy link
Owner

pa7 commented Apr 19, 2013

I didn't experience the same effect, the differences were not significant for me.
did you test it with multiple devices/browsers?

@pa7 pa7 closed this Nov 28, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants