/
atom.xml
122 lines (83 loc) · 3.79 KB
/
atom.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[Category: hadoop | Coding For Fun]]></title>
<link href="http://utopiazh.github.com/blog/categories/hadoop/atom.xml" rel="self"/>
<link href="http://utopiazh.github.com/"/>
<updated>2013-03-17T10:56:52+08:00</updated>
<id>http://utopiazh.github.com/</id>
<author>
<name><![CDATA[Hang Zhou]]></name>
</author>
<generator uri="http://octopress.org/">Octopress</generator>
<entry>
<title type="html"><![CDATA[Multi-tenant of HBase in FaceBook]]></title>
<link href="http://utopiazh.github.com/blog/2013/03/10/multi-tenancies-of-hbase-in-facebook/"/>
<updated>2013-03-10T09:24:00+08:00</updated>
<id>http://utopiazh.github.com/blog/2013/03/10/multi-tenancies-of-hbase-in-facebook</id>
<content type="html"><![CDATA[<h1>Overview</h1>
<p>Watched the video for 2 times to get the main point, the highlights learned from this video is:</p>
<ul>
<li>Use unique message id as ts to store into hbase</li>
<li>Use monitoring/reporting tools to know what exactly the access pattern is, while the app developers don't know either</li>
<li>Use underlying hashout solution to isolate traffic, instead of an optimistic solution of using table names.</li>
</ul>
<h1>HBase adoption</h1>
<h2>Messages:</h2>
<p><strong>Requirements:</strong></p>
<ul>
<li>Very high volume (9B+ msg, 90B r+w ops per day), 4.5 PB no compressed.</li>
<li>Elasticity & auto-failover</li>
<li>Strong consistency within a data centeer</li>
<li>import large data set (old data)</li>
</ul>
<p><strong>Schema:</strong></p>
<ul>
<li>3 CF (actions, snapshot (cache), keywords (search))</li>
<li>Use message id as timestamp, which leverage the 3rd dimension.</li>
<li>Fast iteration on schema
- Action logs are single source of truth
- Most data during schema iteration from snapshot and keywords
- Custom backup solution (action logs), which could be replayed on schema evolving.</li>
</ul>
<h2>ODS (Operational Data Store)</h2>
<ul>
<li>Schema: 3CFs, raw, hour, days</li>
<li>MR job to move data to buckets</li>
<li>Compaction tricks (leverage the time series to pick the right HFile to achieve more spatial locality)</li>
<li>Separating HBase clusters and its backing HDFS clusters proved not feasible</li>
<li>HA is essential (Master/Master cluster replication with MR jobs)</li>
</ul>
<h1>HBase best practices</h1>
<p><strong>Process:</strong></p>
<p>Select the right apps: no five nines, etc.</p>
<p>Shadow testing before going production (real traffic goes in and out without noticed by users).</p>
<p><strong>Schema Design</strong></p>
<p>One CF or Multiple CF: common issue is putting all data into one CF, so get poor spatial locality </p>
<p>New table or New CF: New CF provides better locality, e.g. users message and search, the best user experience is that all users data is complete available or down</p>
<p><strong>Lessons:</strong></p>
<ul>
<li>Need graph/reporting monitoring to provide insights</li>
<li>Constant upgrade (facebook is running HBase 0.89 with back-ported patches)</li>
</ul>
<h1>Multi-Tenants</h1>
<p><strong>optimistic multi-tenancies</strong></p>
<ul>
<li>let users do the right things</li>
<li>lack of table naming rules</li>
</ul>
<p><strong>Hashout:</strong></p>
<ul>
<li>Was for Mysql, ported to support HBase</li>
<li>review every app</li>
<li>block abusive users, promote heave users</li>
<li>users must provide app id and key, system app id to specific shard</li>
<li>backed reading ops with memcached</li>
<li>intensive MR job on separate cluster</li>
</ul>
<h1>Reference</h1>
<ul>
<li><a href="vimeo.com/44715953" title="Multi-tenant HBase Solutions at Facebook">Nicolas Spiegelberg - Multi-tenant HBase Solutions at Facebook</a></li>
</ul>
]]></content>
</entry>
</feed>