To do a full flush, do this:

curl -XPOST host:9200/_flush?full=true

(run it every 30 min during import)
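
A minimal way to do that during an import is a plain loop on the loader box (a sketch; swap in the real host and adjust the interval as needed):

# hypothetical flush loop: full flush every 30 minutes while the import runs
while true ; do curl -XPOST 'host:9200/_flush?full=true' ; sleep 1800 ; done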
(columns: bytes, records, elapsed, seconds, recs/sec per machine, KB/sec per machine; multi-machine rows add machine count, total recs/sec, total KB/sec -- see the dr helper near the bottom)

1 c1.xl 4es/12sh 768m buffer 1400m heap
2,584,346,624 255,304 0h14m25 865 295 2917

1 m1.xl 4es/12sh 1500m buffer 3200m heap
79,364,096 464,701 0h01m02 62 7495 1250
210,305,024 1,250,000 0h02m39 159 7861 1291
429,467,863 2,521,538 0h03m28 208 12122 2016

1 m1.xl 4es/12sh 4hdp 1800m buffer 3200m heap 300000 tlog
429,467,863 2,521,538 0h03m11 191 13201 2195

1 m1.xl 4es/12sh 4hdp 1800m buffer 2400m heap 100000 tlog 1000 batch lzw compr ulimit-l-unlimited (and in all following)
0h03m47

1 m1.xl 4es/12sh 4hdp 1800m buffer 2400m heap 200000 tlog 1000 batch no compr
0h3m22
again on top of data already loaded
0h3m16

1 m1.xl 4es/12sh 64hdp 1800m buffer 2800m heap 200000 tlog 50000 batch no compr
433,782,784 2,250,000 0h01m17 (froze up on mass assault once 50k batch was reached)

1 m1.xl 4es/12sh 64hdp 1800m buffer 2800m heap 200000 tlog 5000 batch no compr
785,514,496 4,075,000 0h05m59 359 11350 2136 cpu 4x70%
1,207,500,800 6,270,000 0h08m26 506 12391 2330

1 m1.xl 4es/12sh 64hdp 1800m buffer 2800m heap 200000 tlog 5000 batch no compr
163,512,320 845,000 0h01m49 109 7752 1464 cpu 4x75% ios 6k-8k x4 if 2800/440 ram 13257/15360MB
641,990,656 3,345,000 0h04m41 281 11903 2231
896,522,559 4,683,016 0h06m11 371 12622 2359
1,131,916,976 5,937,895 0h07m05 425 13971 2600

1 m1.xl 4es/12sh 16hdp 1800m buffer 2800m heap 200000 tlog 5000 batch no compr
74,383,360 385,000 0h01m50 110 3500 660
286,720,000 1,495,000 0h02m21 141 10602 1985
461,701,120 2,410,000 0h03m30 210 11476 2147
733,413,376 3,830,000 0h05m10 310 12354 2310
1,131,916,976 5,937,895 0h07m16 436 13619 2535

1 m1.xl 4es/12sh 64hdp 1800m buffer 2800m heap 200000 tlog 1000 batch no compr
156,958,720 813,056 0h01m35 95 8558 1613
305,135,616 1,586,176 0h02m25 145 10939 2055
446,300,160 2,323,456 0h03m10 190 12228 2293
690,028,544 3,594,240 0h04m40 280 12836 2406
927,807,418 4,850,093 0h06m10 370 13108 2448
1,131,916,976 5,937,895 0h06m55 415 14308 2663

1 m1.xl 4es/12sh 16hdp 1800m buffer 2800m heap 200000 tlog 1024 batch no compr
234,749,952 1,222,656 0h02m08 128 9552 1791
713,097,216 3,723,264 0h04m56 296 12578 2352
1,131,916,976 5,937,895 0h06m49 409 14518 2702

1 m1.xl 4es/12sh 20hdp 1800m buffer 2800m heap 200000 tlog 1024 batch no compr mergefac 40
190,971,904 994,304 0h01m55 115 8646 1621
326,107,136 1,699,840 0h02m52 172 9882 1851
707,152,365 3,709,734 0h04m51 291 12748 2373 672 files
again:
187,170,816 973,824 0h01m49 109 8934 1676
707,152,365 3,709,734 0h05m39 339 10943 2037 1440 files ; 18 *.tis typically 4.3M
again:
707,152,365 3,709,734 0h04m54 294 12618 2348 2052 files ; 28 *.tis typically 4.3M

1 m1.xl 4es/12sh 20hdp 1800m buffer 2800m heap 50_000 tlog 1024 batch no compr mergefac 20 (and in following)
349,372,416 1,821,696 0h02m42 162 11245 2106
707,152,365 3,709,734 0h04m43 283 13108 2440

1 m1.xl 4es/4sh 20hdp 1800m buffer 2800m heap 200_000 tlog 1024 batch no compr 64m engine.ram_buffer_size -- 3s ping_interval -- oops 10s refresh
253,689,856 1,321,984 0h02m48 168 7868 1474
707,152,365 3,709,734 0h05m55 355 10449 1945

1 m1.xl 4es/4sh 20hdp 1800m buffer 2800m heap 200_000 tlog 1024 batch no compr 256m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h04m31 271 13689 2548

1 m1.xl 4es/4sh 20hdp 1800m buffer 2800m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h04m08 248 14958 2784

1 m1.xl 4es/4sh 20hdp 1800m buffer 2800m heap 200_000 tlog 1024 batch no compr 768m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h04m47 287 12925 2406
again
707,152,365 3,709,734 0h04m27 267 13894 2586

1 m1.xl 4es/4sh 20hdp 768m buffer 2800m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h04m14 254 14605 2718

1 c1.xl 4es/4sh 20hdp 768m buffer 1200m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h02m55 175 21198 3946 ios 11282 ifstat 3696.26 695.26

1 c1.xl 4es/4sh 40hdp 768m buffer 1200m heap 200_000 tlog 4096 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,912,831 3,713,598 0h03m05 185 20073 3736

1 c1.xl 4es/4sh 40hdp 768m buffer 1200m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,912,831 3,713,598 0h02m59 179 20746 3862

1 c1.xl 4es/4sh 20hdp 256m buffer 1200m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h02m53 173 21443 3991

1 c1.xl 4es/4sh 20hdp 512m buffer 1200m heap 200_000 tlog 1024 batch no compr 768m engine.ram_buffer_size -- 3s ping_interval
707,152,365 3,709,734 0h03m00 180 20609 3836


8 c1.xl 32es/32sh 14hdp/56 512m buffer 1200m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval
1,115,291,648 5,814,272 0h01m44 104 6988 1309 8 55906 10472
2,779,840,512 14,540,800 0h06m34 394 4613 861 8 36905 6890
6,100,156,416 32,508,928 0h14m51 891 4560 835 8 36485 6685
(killed)

8 c1.xl 24es/24sh 14hdp/56 256m buffer 1200m heap 200_000 tlog 1024 batch no compr 384m engine.ram_buffer_size -- 3s ping_interval
980,221,952 5,107,662 0h01m28 88 7255 1359 8 58041 10877
1,815,609,344 9,483,259 0h01m59 119 9961 1862 8 79691 14899
4,451,270,656 23,694,336 0h04m06 246 12039 2208 8 96318 17670
6,713,269,627 35,778,171 0h06m00 360 12422 2276 8 99383 18210

8 c1.xl 24es/24sh 14hdp/140 512m buffer 1200m heap 200_000 tlog 1024 batch no compr 384m engine.ram_buffer_size -- 3s ping_interval
4,743,036,929 24,825,856 0h04m39 279 11122 2075 8 88981 16601
8,119,975,937 42,889,216 0h07m00 420 12764 2360 8 102117 18880
17,273,994,924 91,991,529 0h15m14 914 12580 2307 8 100647 18456
23,598,696,768 123,812,641 0h24m04 1444 10717 1994 8 85742 15959


8 m1.xl 32es/32sh 14hdp/53 1800m buffer 2800m heap 200_000 tlog 1024 batch no compr 512m engine.ram_buffer_size -- 3s ping_interval -- merge_factor 30
306,296,262 1,608,526 0h01m18 78 2577 479 8 20622 3834
1,814,083,014 9,564,301 0h02m33 153 7813 1447 8 62511 11578
2,837,886,406 15,030,140 0h04m49 289 6500 1198 8 52007 9589
3,928,208,838 21,039,950 0h06m22 382 6884 1255 8 55078 10042
6,322,378,160 33,875,546 0h11m28 688 6154 1121 8 49237 8974

8 c1.xl 24es/24sh 14hdp/140 512m buffer 1200m heap 200_000 tlog 4096 batch no compr 256m engine.ram_buffer_size -- 3s ping_interval -- merge_factor 30
4,717,346,816 24,855,996 0h04m55 295 10532 1952 8 84257 15616
9,735,831,552 51,896,969 0h09m23 563 11522 2110 8 92179 16887


(200910)
2,746,875,904 10,555,392 0h02m50 170 7761 1972 8 62090 15779
43,201,339,007 166,049,864 0h35m06 2106 9855 2504 8 78846 20032


2009{10,11,12}

8 c1.xl 24es/24sh 14hdp/140 512m buffer 1200m heap 200_000 tlog 4096 batch no compr 256m engine.ram_buffer_size -- 3s ping_interval -- merge_factor 30
135,555,262,283 516,220,825 2h16m13 8173 7895 2024 8 63161 16197


slug=tweet-2009q3pre
curl -XGET 'http://10.99.10.113:9200/_flush/'
curl -XPUT "http://10.99.10.113:9200/$slug/"
rake -f ~/ics/backend/wonderdog/java/Rakefile
~/ics/backend/wonderdog/java/bin/wonderdog --rm --index_name=$slug --bulk_size=4096 --object_type=tweet \
  /tmp/tweet_by_month-tumbled/"tweet-200[678]" /tmp/es_bulkload_log/$slug


sudo kill `ps aux | egrep '^61021' | cut -c 10-15`

for node in '' 2 3 ; do echo $node ; sudo node=$node ES_MAX_MEM=1600m ~/ics/backend/wonderdog/config/run_elasticsearch-2.sh ; done


for node in '' 2 3 4 ; do echo $node ; sudo node=$node ES_MAX_MEM=1200m ~/ics/backend/wonderdog/config/run_elasticsearch-2.sh ; done
sudo kill `ps aux | egrep '^61021' | cut -c 10-15` ; sleep 10 ; sudo rm -rf /mnt*/elasticsearch/* ; ps auxf | egrep '^61021' ; zero_log /var/log/elasticsearch/hoolock.log

ec2-184-73-41-228.compute-1.amazonaws.com

Query for success:
curl -XGET 'http://10.195.10.207:9200/tweet/tweet/_search?q=text:mrflip' | ruby -rubygems -e 'require "json" ; puts JSON.pretty_generate(JSON.load($stdin))'

Detect settings:
grep ' with ' /var/log/elasticsearch/hoolock.log | egrep 'DEBUG|INFO' | cut -d\] -f2,3,5- | sort | cutc | uniq -c

Example index sizes:
ls -lRhart /mnt*/elasticsearch/data/hoolock/nodes/*/indices/tweet/0/*/*.{tis,fdt}

# Format a raw benchmark line ("bytes  records  0h04m31  [machines]") into the table
# rows above: bytes, records, elapsed, seconds, recs/sec and KB/sec per machine,
# machine count, total recs/sec, total KB/sec.
def dr(line)
  sbytes, srecs, time, mach, *_ = line.strip.split(/\s+/)
  bytes = sbytes.gsub(/\D/,"").to_i
  recs  = srecs.gsub(/\D/,"").to_i
  mach  = mach.to_i ; mach = 1 if mach == 0            # machine count defaults to 1
  s, m, h = [0, 0, 0, time.split(/\D/)].flatten.reverse.map(&:to_i)
  tm = (3600*h + 60*m + s)                             # elapsed seconds
  results = "%14s\t%12s\t%01dh%02dm%02d\t%7d\t%7d\t%7d\t%7d\t%7d\t%7d" %
    [sbytes, srecs, h, m, s, tm, recs/tm/mach, bytes/tm/1024/mach, mach, recs/tm, bytes/tm/1024]
  puts results
  results
end
# e.g. dr("2,584,346,624  255,304  0h14m25")  # => the numbers in the first row above


# . jack up batch size and see effect on rec/sec, find optimal (see the sweep sketch below)
# . run multiple mappers with one data es_node with optimal batch size, refind if necessary
# . work data es_node heavily but don't drive it into the ground
# . tune lucene + jvm options for data es_node
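
A sketch of the first item, reusing the wonderdog invocation from the bulk-load command above (host, paths, and flags are carried over from that command; the per-run slug naming is made up here):

# hypothetical sweep over bulk sizes to find the rec/sec sweet spot
for bsize in 256 1024 4096 16384 ; do
  slug=tweet-bulktest-$bsize                      # one index per bulk size
  curl -XPUT "http://10.99.10.113:9200/$slug/"    # create the index, as in the command above
  time ~/ics/backend/wonderdog/java/bin/wonderdog --rm --index_name=$slug --bulk_size=$bsize --object_type=tweet \
    /tmp/tweet_by_month-tumbled/"tweet-200[678]" /tmp/es_bulkload_log/$slug
done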
14 files, 3 hadoop nodes w/ 3 tasktrackers each  27 min
14 files, 3 hadoop nodes w/ 5 tasktrackers each  22 min

12 files @ 500k lines -- 3M rec -- 3 hdp/2 tt -- 2 esnodes -- 17m


6 files @ 100k = 600k rec -- 3hdp/2tt -- 1 es machine/2 esnodes -- 3m30
6 files @ 100k = 600k rec -- 3hdp/2tt -- 1 es machine/4 esnodes -- 3m20
5 files, 3 nodes,


Did 2,400,000 recs 24 tasks 585,243,042 bytes -- 15:37 on 12 maps/3nodes

Did _optimize
real 18m29.548s  user 0m0.000s  sys 0m0.000s  pct 0.00
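
(The optimize call itself isn't recorded here; presumably it was the standard endpoint, something along these lines -- index name assumed:)

# hypothetical: optimize the tweet index after the bulk load
curl -XPOST 'http://localhost:9200/tweet/_optimize'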


java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)


===========================================================================

The refresh API allows you to explicitly refresh one or more indices, making all
operations performed since the last refresh available for search. The (near)
real-time capabilities depend on the index engine used. For example, the robin
one requires refresh to be called, but by default a refresh is scheduled
periodically.

curl -XPOST 'http://localhost:9200/twitter/_refresh'

The refresh API can be applied to more than one index with a single call, or even on _all the indices.
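
For example, hitting the endpoint without an index name refreshes every index in one call:

curl -XPOST 'http://localhost:9200/_refresh'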
runs:
  - es_machine: m1.xlarge
    es_nodes: 1
    es_max_mem: 1500m
    bulk_size: 5
    maps: 1
    records: 100000
    shards: 12
    replicas: 1
    merge_factor: 100
    thread_count: 32
    lucene_buffer_size: 256mb
    runtime: 108s
    throughput: 1000 rec/sec
  - es_machine: m1.xlarge
    es_nodes: 1
    bulk_size: 5
    maps: 1
    records: 100000
    shards: 12
    replicas: 1
    merge_factor: 1000
    thread_count: 32
    lucene_buffer_size: 256mb
    runtime: 77s
    throughput: 1300 rec/sec
  - es_machine: m1.xlarge
    es_nodes: 1
    bulk_size: 5
    maps: 1
    records: 100000
    shards: 12
    replicas: 1
    merge_factor: 10000
    thread_count: 32
    lucene_buffer_size: 512mb
    runtime: 180s
    throughput: 555 rec/sec