<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Zipkin, from Twitter</title>
<link href="css/bootstrap-2.1.0.min.css" rel="stylesheet">
<link href="css/docs.css" rel="stylesheet">
<!--<link href="css/bootstrap-responsive-2.0.4.min.css" rel="stylesheet">-->
</head>
<body>
<div class="navbar navbar-inverse navbar-fixed-top">
<div class="navbar-inner">
<div class="container">
<a class="brand" href="index.html">Zipkin</a>
<ul class="nav">
<li><a href="architecture.html">Architecture</a></li>
<li><a href="install.html">Install</a></li>
<li><a href="hadoop.html">Hadoop</a></li>
<li><a href="instrument.html">Instrumenting a library</a></li>
</ul>
<ul class="nav pull-right">
<li><a href="https://github.com/twitter/zipkin">GitHub</a></li>
<li><a href="https://twitter.com/zipkinproject">@ZipkinProject</a></li>
</ul>
</div>
</div>
</div>
<header class="jumbotron masthead">
<div class="inner">
<h1>Zipkin, from Twitter</h1>
<p class="lead">A distributed tracing system</p>
</div>
</header>
<div class="container">
<section id="examples">
<h2>Running a Hadoop job</h2>
<p>It's possible to set up Scribe to log into Hadoop. If you do this, you can generate various reports from the data
that are not easy to produce on the fly in Zipkin itself.</p>
<p>We use a library called <a href="http://github.com/twitter/scalding">Scalding</a> to write Hadoop jobs in Scala.</p>
<ol>
<li>To run a Hadoop job, first build the fat jar:
<code>sbt 'project zipkin-hadoop' compile assembly</code></li>
<li>Edit <code>scald.rb</code> so that it points to the host you want to copy the jar to and run the job from.</li>
<li>Update the version of the jar file in <code>scald.rb</code> if needed.</li>
<li>You can then run the job using our <code>scald.rb</code> script:
<code>./scald.rb --hdfs com.twitter.zipkin.hadoop.[classname] --date yyyy-mm-ddThh:mm yyyy-mm-ddThh:mm --output [dir]</code></li>
</ol>
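<p>As an illustration only, a filled-in invocation of the last step might look like the following. The job class <code>ExampleJob</code> and the output directory <code>example_output</code> are hypothetical placeholders, not names taken from zipkin-hadoop, and the two <code>--date</code> values are assumed to be the start and end of the time range to process. Substitute a real job class and an output directory you can write to.</p>
<pre><code># Hypothetical class name and output directory; replace both before running.
# The two timestamps are assumed to mark the start and end of the range.
./scald.rb --hdfs com.twitter.zipkin.hadoop.ExampleJob --date 2012-07-01T00:00 2012-07-02T00:00 --output example_output</code></pre>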
</section>
</div>
<footer class="footer">
<div class="container">
<p>Copyright 2012 Twitter, Inc.</p>
<p>Licensed under the Apache License, Version 2.0: <a href="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</a></p>
</div>
</footer>
</body>
</html>