minor optimize IndexMerger's MMappedIndexRowIterable #2084

Merged
merged 1 commit into from
Dec 18, 2015

Conversation

binlijin
Contributor

Copied from IndexMaker.

@binlijin
Contributor Author

The performance numbers are:
The data has 95 dimensions (this change is useful when there are many dimensions).
Before:
2015-12-11T14:59:52,637 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base5473179426075742751flush/merged/v8-tmp] walked 500,000/500,000 rows in 14,559 millis.
2015-12-11T15:00:11,781 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base5473179426075742751flush/merged/v8-tmp] completed walk through of 916,822 rows in 33,761 millis.

2015-12-11T15:01:42,938 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base1258220153873171419flush/merged/v8-tmp] walked 500,000/500,000 rows in 15,017 millis.
2015-12-11T15:02:01,454 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base1258220153873171419flush/merged/v8-tmp] completed walk through of 916,822 rows in 33,598 millis.

After:
2015-12-11T14:47:48,135 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base4977721030563756003flush/merged/v8-tmp] walked 500,000/500,000 rows in 13,541 millis.
2015-12-11T14:48:06,100 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base4977721030563756003flush/merged/v8-tmp] completed walk through of 916,822 rows in 31,570 millis.

2015-12-11T14:50:33,574 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base2937734964603563157flush/merged/v8-tmp] walked 500,000/500,000 rows in 14,069 millis.
2015-12-11T14:50:51,457 INFO [main] io.druid.segment.IndexMerger - outDir[/var/folders/z7/g5zy3kfj7t54y1f074hsqw7h0000gn/T/base2937734964603563157flush/merged/v8-tmp] completed walk through of 916,822 rows in 32,012 millis.
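
For comparing runs like these, the row count and elapsed time can be pulled out of the "completed walk through" lines mechanically instead of by eye. The following is a minimal sketch under stated assumptions: `WalkLogParser` and `parse` are hypothetical names that are not part of Druid, and the regular expression only assumes the message tail shown in the logs above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Druid: extracts rows and millis from
// IndexMerger "completed walk through" log lines like the ones above.
class WalkLogParser {
    // Numbers in the log use thousands separators, e.g. "916,822".
    private static final Pattern COMPLETED =
        Pattern.compile("completed walk through of ([\\d,]+) rows in ([\\d,]+) millis");

    /** Returns {rows, millis}, or null if the line is not a "completed walk through" line. */
    static long[] parse(String line) {
        Matcher m = COMPLETED.matcher(line);
        if (!m.find()) {
            return null;
        }
        long rows = Long.parseLong(m.group(1).replace(",", ""));
        long millis = Long.parseLong(m.group(2).replace(",", ""));
        return new long[] {rows, millis};
    }

    public static void main(String[] args) {
        // Abbreviated copy of a log line; the timestamp and outDir path are omitted here.
        String line = "INFO [main] io.druid.segment.IndexMerger - completed walk through of 916,822 rows in 33,761 millis.";
        long[] r = parse(line);
        System.out.printf("rows=%d millis=%d rows/sec=%.0f%n", r[0], r[1], r[0] * 1000.0 / r[1]);
    }
}
```

Lines that report intermediate progress ("walked 500,000/500,000 rows") do not match the pattern and yield `null`, so only the total per run is collected.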

@nishantmonu51
Member

+1

@drcrallen
Contributor

@binlijin what did you run the benchmarks on: EC2, your laptop, or some other dedicated machine?

@fjy
Contributor

fjy commented Dec 11, 2015

@drcrallen China doesn't have EC2, it has AliCloud

edit: my bad, it does have EC2 :P

@drcrallen
Contributor

All right, I'll be more explicit: given the short time range of the tests presented and the small improvement (small improvements are good if they are real!), are variations such as frequency governor fluctuations taken into account?

@binlijin
Contributor Author

@drcrallen I ran it on my laptop.

@navis
Contributor

navis commented Dec 14, 2015

@binlijin Could you include the test you used in Druid? We might need a standard test for performance.

@binlijin
Contributor Author

@navis, I tested it with one million rows of our real data.

@xvrl
Member

xvrl commented Dec 14, 2015

@binlijin can you try running on a dedicated machine? What @drcrallen is saying is that the difference you are seeing could just be due to clock rate / temperature fluctuations on your laptop.
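
One common way to reduce the effect of fluctuations like these is to repeat the measurement several times after a warm-up and compare medians rather than single runs. The following is a minimal sketch, not the test used in this PR: `RepeatedTimer`, `time`, and `median` are hypothetical names, and the `Runnable` stands in for whatever merge call is being measured. A dedicated harness such as JMH would be more rigorous.

```java
import java.util.Arrays;

// Hypothetical timing helper, not part of Druid.
class RepeatedTimer {
    static volatile long sink; // keeps the placeholder workload from being optimized away

    /** Runs the task `warmups` times untimed, then `runs` times timed; returns sorted nanos. */
    static long[] time(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    /** Median of an already sorted, non-empty array. */
    static long median(long[] sorted) {
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the code path under test.
        Runnable task = () -> {
            long s = 0;
            for (int i = 0; i < 5_000_000; i++) {
                s += i;
            }
            sink = s;
        };
        long[] sorted = time(task, 3, 9);
        System.out.printf("min=%d ns median=%d ns max=%d ns%n",
            sorted[0], median(sorted), sorted[sorted.length - 1]);
    }
}
```

Reporting the minimum and maximum next to the median makes it easier to see whether a before/after difference is larger than the spread between runs.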

@binlijin
Contributor Author

I tested it on our test machine. The performance numbers are:
Before
2015-12-15 06:29:47,050 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base8567491474144566728flush/merged/v8-tmp] walked 500,000/500,000 rows in 21,429 millis.
2015-12-15 06:30:11,859 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base8567491474144566728flush/merged/v8-tmp] completed walk through of 916,822 rows in 46,276 millis.

2015-12-15 06:35:17,978 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base8564891586222722664flush/merged/v8-tmp] walked 500,000/500,000 rows in 22,278 millis.
2015-12-15 06:35:43,443 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base8564891586222722664flush/merged/v8-tmp] completed walk through of 916,822 rows in 47,780 millis.

2015-12-15 06:37:38,199 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base7286842343919247533flush/merged/v8-tmp] walked 500,000/500,000 rows in 21,865 millis.
2015-12-15 06:38:01,388 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base7286842343919247533flush/merged/v8-tmp] completed walk through of 916,822 rows in 45,092 millis.

After
2015-12-15 06:08:30,580 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base7086414238284544897flush/merged/v8-tmp] walked 500,000/500,000 rows in 20,996 millis.
2015-12-15 06:08:55,490 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base7086414238284544897flush/merged/v8-tmp] completed walk through of 916,822 rows in 45,943 millis.

2015-12-15 06:10:55,844 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base3754891995514154122flush/merged/v8-tmp] walked 500,000/500,000 rows in 19,548 millis.
2015-12-15 06:11:20,352 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base3754891995514154122flush/merged/v8-tmp] completed walk through of 916,822 rows in 44,095 millis.

2015-12-15 06:13:21,332 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base2471600163749106438flush/merged/v8-tmp] walked 500,000/500,000 rows in 21,338 millis.
2015-12-15 06:13:45,834 INFO [main] segment.IndexMerger (Logger.java:info(70)) - outDir[/tmp/base2471600163749106438flush/merged/v8-tmp] completed walk through of 916,822 rows in 45,878 millis.

It is slower because my laptop has an SSD and this machine does not.

processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2430 0 @ 2.20GHz
stepping : 7
cpu MHz : 2194.333
cache size : 15360 KB
physical id : 1
siblings : 12
core id : 5
cpu cores : 6
apicid : 43
initial apicid : 43
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
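
The three "completed walk through" totals per side from the test machine can be summarized as a mean and a range. The sketch below does only that arithmetic; `MergeTimingSummary`, `mean`, and `range` are hypothetical names, and the input values are copied from the log lines above. With three runs per side this is a rough summary, not a significance test.

```java
import java.util.Arrays;

// Hypothetical summary helper, not part of Druid.
class MergeTimingSummary {
    /** Arithmetic mean of the samples (NaN for an empty array). */
    static double mean(long[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    /** Difference between the largest and smallest sample of a non-empty array. */
    static long range(long[] xs) {
        return Arrays.stream(xs).max().getAsLong() - Arrays.stream(xs).min().getAsLong();
    }

    public static void main(String[] args) {
        // Total walk-through times in millis, taken from the test-machine logs above.
        long[] before = {46_276, 47_780, 45_092};
        long[] after = {45_943, 44_095, 45_878};

        double diff = mean(before) - mean(after);
        System.out.printf("before: mean=%.1f ms range=%d ms%n", mean(before), range(before));
        System.out.printf("after:  mean=%.1f ms range=%d ms%n", mean(after), range(after));
        System.out.printf("mean improvement: %.1f ms (%.1f%%)%n", diff, 100.0 * diff / mean(before));
    }
}
```

On these numbers the means differ by roughly 1.1 seconds (about 2%), while individual runs on each side vary by about 1.8 to 2.7 seconds, so more runs would be needed to pin the improvement down precisely.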

@fjy
Contributor

fjy commented Dec 18, 2015

👍

fjy added a commit that referenced this pull request Dec 18, 2015
minor optimize IndexMerger's MMappedIndexRowIterable
@fjy fjy merged commit 9e6874c into apache:master Dec 18, 2015
@fjy fjy modified the milestone: 0.9.0 Feb 4, 2016
@fjy fjy mentioned this pull request Feb 5, 2016