This repository has been archived by the owner on Jan 23, 2020. It is now read-only.

Commit

Updates README.md files
sagemintblue committed Jan 12, 2014
1 parent 2d2e80b commit 470b9b4
Showing 5 changed files with 72 additions and 51 deletions.
65 changes: 30 additions & 35 deletions README.md
@@ -3,25 +3,25 @@
Twitter Ambrose is a platform for visualization and real-time monitoring of MapReduce data workflows.
It presents a global view of all the map-reduce jobs derived from your workflow after planning and
optimization. As jobs are submitted for execution on your Hadoop cluster, Ambrose updates its
visualization to reflect the latest job status, polled from your process.
visualization to reflect the latest job status.

Ambrose provides the following in a web UI:

* A table view of all the associated jobs, along with their current state
* Chord and graph diagrams to visualize job dependencies and current state
* An overall script progress bar
* Visual weighting of jobs based on resource consumption
* Visual weighting of vectors based on data volume
* A workflow progress bar depicting percent completion of the entire workflow
* A table view of all workflow jobs, along with their current state
* A graph diagram which depicts job dependencies and metrics
* Visual weighting of jobs based on resource consumption
* Visual weighting of job dependencies based on data volume
* Script view with line highlighting (Pig only)

Ambrose is built using the following front-end technologies:

* [D3.js](http://d3js.org) - For diagram generation
* [Bootstrap](http://twitter.github.com/bootstrap/) - For layout and CSS support
* [jQuery](http://jquery.com), [UnderscoreJS](http://underscorejs.com), [RequireJS](http://requirejs.org) - Core javascript libraries and JS module definition
* [D3.js](http://d3js.org) - Diagram generation
* [Bootstrap](http://getbootstrap.com/) - Layout and CSS support

Ambrose is designed to support any workflow runtime, but currently supports [Apache
Pig](http://pig.apache.org/), [Hive](http://hive.apache.org/), [Cascading](http://www.cascading.org/)
and [Scalding](https://github.com/twitter/scalding).
Ambrose is designed to support any workflow runtime. See the following section for supported
runtimes.

Follow [@Ambrose](https://twitter.com/ambrose) on Twitter to stay in touch!

@@ -30,24 +30,23 @@ Follow [@Ambrose](https://twitter.com/ambrose) on Twitter to stay in touch!
* [Pig](http://pig.apache.org/) - See [pig/README.md](https://github.com/twitter/ambrose/blob/master/pig/README.md)
* [Hive](http://hive.apache.org/) - See [hive/README.md](https://github.com/twitter/ambrose/blob/master/hive/README.md)
* [Cascading](http://www.cascading.org/) - See [cascading/README.md](https://github.com/twitter/ambrose/blob/master/cascading/README.md)
* [Scalding](https://github.com/twitter/scalding) - See [cascading/README.md](https://github.com/twitter/ambrose/blob/master/cascading/README.md)
* [Scalding](https://github.com/twitter/scalding) - See [scalding/README.md](https://github.com/twitter/ambrose/blob/master/scalding/README.md)
* [Cascalog](https://github.com/nathanmarz/cascalog) - future work

## Examples

Below is a screenshot of the Ambrose UI. The interface presents multiple responsive "views" of a
single workflow. Just beneath the toolbar at the top of the screen is a workflow progress bar that
tracks overall completion percentage of the workflow. Below the progress bar are two diagrams which
depict the workflow's jobs and their dependencies. Below the diagrams is a table of workflow
Below is a screenshot of the Ambrose workflow UI. The interface presents multiple responsive "views"
of a single workflow. Just beneath the toolbar at the top of the window is a workflow progress bar
that tracks overall completion of the workflow. Below the progress bar is a graph diagram which
depicts the workflow's jobs and their dependencies. Below the graph diagram is a table of workflow
jobs.

All views react to mouseover and click events on a job, regardless of the view on which the event is
triggered; Moving your mouse over the first row of the table will highlight that job row along with
the associated job arc in the chord diagram and job node in the graph diagram. Clicking on a job in
any view will select it, updating the highlighting of that job in all views. Clicking twice on the
same job will deselect it.
triggered; moving your mouse over the first row of the table will highlight that job's table row
along with the job's node in the graph diagram. Clicking on a job in any view will select it,
updating the highlighting of that job in all views. Clicking again on the same job will deselect it.

![Ambrose UI screenshot](https://github.com/twitter/ambrose/raw/master/docs/img/ambrose-ss1.png)
![Ambrose workflow screenshot](https://github.com/twitter/ambrose/raw/master/docs/img/ambrose-ss1.png)

## Quickstart

@@ -67,33 +66,29 @@ command and then browse to
./bin/ambrose-demo
```

To run Ambrose with an actual Pig script, you'll need to build the Ambrose Pig distribution:
To run Ambrose with a Pig script, you'll need to build the Ambrose Pig distribution:

```
mvn package
```

You can then run the following commands to execute `path/to/my/script.pig` with an Ambrose app server
embedded within the Pig client:
You can then run the following commands to execute `script.pig` with an embedded web server which
hosts the Ambrose web application:

```
cd pig/target/ambrose-pig-$VERSION-bin/ambrose-pig-$VERSION
./bin/pig-ambrose -f path/to/my/script.pig
AMBROSE_PORT=8080 ./bin/pig-ambrose -f script.pig
```
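The `cd` command above assumes `$VERSION` is already set. As a minimal sketch, assuming the build output follows the `pig/target/ambrose-pig-$VERSION-bin/ambrose-pig-$VERSION` layout shown above, a small helper can locate the directory without knowing the version. The function name `find_ambrose_pig_dist` is hypothetical and not part of Ambrose:

```shell
# Hypothetical helper (not part of Ambrose): print the first directory that
# matches the distribution layout shown above, relative to the repository root.
find_ambrose_pig_dist() {
  for dir in pig/target/ambrose-pig-*-bin/ambrose-pig-*; do
    # If the glob matches nothing, $dir is the literal pattern and fails -d.
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  echo "no Ambrose Pig distribution found; run 'mvn package' first" >&2
  return 1
}
```

From the repository root, something like `cd "$(find_ambrose_pig_dist)"` could then replace the `cd` line with the explicit version.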

Note that this command delegates to the `pig` script present in your local installation of Pig, so
make sure `$PIG_HOME/bin` is in your path. Now, browse to
[http://localhost:8080/web/workflow.html](http://localhost:8080/web/workflow.html) to see the
progress of your script using the Ambrose UI. To override the default port, export `AMBROSE_PORT`
before invoking `pig-ambrose`:

```
export AMBROSE_PORT=4567
```
Note that the `pig-ambrose` script calls the `pig` script present in your local installation of Pig,
so make sure `$PIG_HOME/bin` is in your path. Now, browse to
[http://localhost:8080/web/workflow.html](http://localhost:8080/workflow.html) to see the progress
of your script with the Ambrose workflow UI.
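Because `pig-ambrose` relies on finding `pig` on your path, it can help to check this before launching. The following is a sketch under stated assumptions: `PIG_HOME` points at your Pig installation, and the function name `ensure_pig_on_path` is hypothetical rather than something Ambrose provides.

```shell
# Hypothetical helper (not part of Ambrose): make sure `pig` is resolvable,
# prepending $PIG_HOME/bin to PATH when it is not.
ensure_pig_on_path() {
  # Already on the path: nothing to do.
  if command -v pig >/dev/null 2>&1; then
    return 0
  fi
  # Fall back to $PIG_HOME/bin if it contains an executable `pig`.
  if [ -n "${PIG_HOME:-}" ] && [ -x "$PIG_HOME/bin/pig" ]; then
    PATH="$PIG_HOME/bin:$PATH"
    export PATH
    return 0
  fi
  echo "pig not found; set PIG_HOME or add Pig's bin directory to PATH" >&2
  return 1
}
```

One possible use is `ensure_pig_on_path && AMBROSE_PORT=8080 ./bin/pig-ambrose -f script.pig`, so the launch is skipped with a clear message when Pig cannot be found.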

## Maven repository

Ambrose releases can be found on Maven under [com.twitter.ambrose](http://repo1.maven.org/maven2/com/twitter/ambrose).
Ambrose releases can be found in the Maven Central Repository within package
[com.twitter.ambrose](http://central.maven.org/maven2/com/twitter/ambrose).

## How to contribute

38 changes: 25 additions & 13 deletions cascading/README.md
@@ -1,29 +1,41 @@
# Ambrose Cascading Support

## Implementation
## Implementation

Cascading integrates with Ambrose via the ```AmbroseCascadingNotificationListener``` class. Cascading starts an
embedded [Jetty](http://jetty.codehaus.org/jetty/) server that exposes job information to the Ambrose Web server.
For more information on Cascading see [Cascading Getting Started](http://www.cascading.org/documentation/).
Ambrose integrates with Cascading via Cascading's `FlowListener` and `FlowStepListener`
interfaces. The
[`AmbroseCascadingNotifier`](https://github.com/twitter/ambrose/blob/master/cascading/src/main/java/com/twitter/ambrose/cascading/AmbroseCascadingNotifier.java)
implements both of these interfaces, and passes Cascading flow events on to an Ambrose
[`StatsWriteService`](https://github.com/twitter/ambrose/blob/master/common/src/main/java/com/twitter/ambrose/service/StatsWriteService.java). For
more information on Cascading see [Cascading Getting
Started](http://www.cascading.org/documentation/).

To run Ambrose with a Cascading program add the following code at the end of Cascading main:
The
[`EmbeddedAmbroseCascadingNotifier`](https://github.com/twitter/ambrose/blob/master/cascading/src/main/java/com/twitter/ambrose/cascading/EmbeddedAmbroseCascadingNotifier.java),
which extends `AmbroseCascadingNotifier`, records flow state in memory, and starts an embedded
[Jetty](http://www.eclipse.org/jetty/) web server that hosts the Ambrose web application.

To use the `EmbeddedAmbroseCascadingNotifier` in your Cascading program, add the following code at
the end of Cascading main:

```
// creates the embedded cascading listener before tfidfFlow.complete();
EmbeddedAmbroseCascadingProgressNotificationListener server = new EmbeddedAmbroseCascadingProgressNotificationListener();
// Create the embedded Cascading notifier before calling flow.complete()
EmbeddedAmbroseCascadingNotifier server = new EmbeddedAmbroseCascadingNotifier();
```

Then, add the listeners to the Flow:
Then, add the listeners to your Flow:

```
tfidfFlow.addListener(server);
tfidfFlow.addStepListener(server);
tfidfFlow.complete();
flow.addListener(server);
flow.addStepListener(server);
flow.complete();
```

Note: ```tfidfFlow``` is the Flow of the Cascading example
[cascading for the impatient part 5](http://www.cascading.org/2012/07/31/cascading-for-the-impatient-part-5/).
When your Cascading program executes, the embedded Jetty web server will (by default) bind to
localhost port 8080, allowing you to browse to http://localhost:8080/ to see the Ambrose web
application and its visualization of workflow state.

## Authors

* [Ahmed Mohsen](https://github.com/Ahmed--Mohsen) ([@Ahmed__Mohsen](https://twitter.com/Ahmed__Mohsen))
* [Ahmed Eshra](https://github.com/engeshra) ([@engeshra](https://twitter.com/engeshra))
Binary file modified docs/img/ambrose-ss1.png
7 changes: 4 additions & 3 deletions pig/README.md
@@ -3,9 +3,10 @@
## Implementation

Ambrose integrates with Pig 0.11.0+ via Pig's `PigProgressNotificationListener` (PPNL)
interface. The `./bin/pig-ambrose` script launches Pig with the Ambrose implementation of PPNL. This
implementation starts an embedded [Jetty](http://jetty.codehaus.org/jetty/) server that exposes job
runtime information to the Ambrose web UI.
interface. The `./bin/pig-ambrose` script launches Pig with the
[`EmbeddedAmbrosePigProgressNotificationListener`](https://github.com/twitter/ambrose/blob/master/pig/src/main/java/com/twitter/ambrose/pig/EmbeddedAmbrosePigProgressNotificationListener.java). This
PPNL records Pig workflow state in memory, and starts an embedded
[Jetty](http://www.eclipse.org/jetty/) web server that hosts the Ambrose web application.

## Known issues

13 changes: 13 additions & 0 deletions scalding/README.md
@@ -0,0 +1,13 @@
# Ambrose Scalding Support

## Usage

To enable the embedded Ambrose server in your Scalding job, add the
[`AmbroseAdapter`](https://github.com/twitter/ambrose/blob/master/scalding/src/main/scala/com/twitter/ambrose/scalding/AmbroseAdapter.scala)
trait to it. Then, while the job is running, browse to http://localhost:8080/. See
[`EmbeddedAmbroseCascadingNotifier`](https://github.com/twitter/ambrose/blob/master/cascading/src/main/java/com/twitter/ambrose/cascading/EmbeddedAmbroseCascadingNotifier.java)
for details on configuring the port, etc.

## Authors

* [twdima](https://github.com/twdima) ([@dimatkach69](https://twitter.com/dimatkach69))
