[BEAM-2839, BEAM-2838] Add MapReduce runner to Beam asf-site. #313
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id 7b6a866 with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id 6a3304c with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
6a3304c to 7b6a866
7b6a866 to 04a5d2d
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id 04a5d2d with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
04a5d2d to 444c04d
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id 444c04d with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
444c04d to 5fead25
R: @kennknowles
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id 5fead25 with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
R: @melap On the overview page, we just list the runners that are on master.
src/_data/capability-matrix.yml
@@ -11,6 +11,8 @@ columns:
name: Apache Apex
- class: gearpump
name: Apache Gearpump
- class: mapreduce
name: MapReduce
Apache Hadoop MapReduce
done
Made some minor edit suggestions below.
One general comment along the same lines as @kennknowles: are we adding runners not in master to the main menu pulldown? I'm not sure. I ask because header.html will need a line item for this in the runners section; otherwise I don't think people can get to it. Alternatively, an additional column in the table on https://beam.apache.org/contribute/work-in-progress/ could link to the feature-branch runners' pages.
The Apache Hadoop MapReduce runner currently supports Apache Hadoop 2.8.1 version.

You can add a dependency on the latest version of the Apache Hadoop MapReduce runner by adding to your pom.xml the following:
by adding the following to your pom.xml:
done
## Apache Hadoop MapReduce Runner prerequisites and setup
You need to have an Apache Hadoop environment with either [Single Node Setup](https://hadoop.apache.org/docs/r1.2.1/single_node_setup.html) or [Cluster Setup](https://hadoop.apache.org/docs/r1.2.1/cluster_setup.html)

The Apache Hadoop MapReduce runner currently supports Apache Hadoop 2.8.1 version.
Apache Hadoop version 2.8.1.
done
The Apache Hadoop MapReduce runner currently supports Apache Hadoop 2.8.1 version.

You can add a dependency on the latest version of the Apache Hadoop MapReduce runner by adding to your pom.xml the following:
```java |
I'd recommend making this a regular block with just ``` ... sometimes the language toggling can be strange and it might be blank if the user chose python SDK on another page.
done
``` | ||
## Deploying Apache Hadoop MapReduce with your application
To execute in a local hadoop environment, use this command:
capitalize Hadoop (multiple places on the page)
done
--fileOutputDir=<directory for intermediate outputs>" | ||
``` | ||
To execute in a hadoop cluster, you need to package your program along will all dependencies in a so-called fat jar.
can remove "you need to"
will -> with
remove "so-called"
done
<tr>
<td><code>runner</code></td>
<td>The pipeline runner to use. This option allows you to determine the pipeline runner at runtime.</td>
<td>Set to <code>MapReduceRunner</code> to run using the Apache Hadoop MapReduce.</td>
remove "the" after using
done
</tr>
<tr>
<td><code>fileOutputDir</code></td>
<td>The directory for files output.</td>
files output -> output files
done
To execute in a hadoop cluster, you need to package your program along will all dependencies in a so-called fat jar.

If you follow along the [Beam Quickstart]({{ site.baseurl }}/get-started/quickstart/) this is the command that you can run:
this quickstart link is a redirect to the Java page.. perhaps link directly to the Java version and text something like this?
If you are using through the [Beam Java SDK Quickstart]({{ site.baseurl }}/get-started/quickstart-java/), you can run this command:
Done with a minor change to "following through"
src/get-started/beam-overview.md
@@ -36,6 +36,8 @@ Beam currently supports Runners that work with the following distributed process
alt="Apache Flink">
* Apache Gearpump (incubating) <img src="{{ site.baseurl }}/images/logos/runners/gearpump.png"
alt="Apache Gearpump">
* Apache Hadoop MapReduce <img src="{{ site.baseurl }}/images/logos/runners/mapreduce.png"
this image is very big compared to the others on the page... may need to specify dimensions or use a smaller image. more a comment for later, as from Kenn's comment, it sounds like this runner shouldn't be listed here yet.
done
PTAL
I have to update header.html in a follow-up PR. Otherwise, it will complain about a dead link.
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id f95c84b with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
- class: mapreduce
l1: 'Yes'
l2: fully supported
l3: ''
Normally, I'd want one sentence about the MapReduce feature used to implement them, but I think with the upcoming revamp it is not worth filling it out. We should have this page revised before the runner is on master.
ack
@@ -505,6 +575,10 @@ categories:
l1: 'No'
l2: ''
l3: ''
- class: mapreduce
l1: 'Yes'
I think the most useful thing for users here is to say "No" and indicate that it is a batch-only runner.
done
@@ -532,6 +606,10 @@ categories:
l1: 'Yes'
l2: fully supported
l3: ''
- class: mapreduce
l1: 'Yes'
Once you say 'No' to triggering, you can leave these blank or just say 'No' without the explanation. It is a nice explanation, though, so you could also leave it.
done
@@ -668,6 +762,10 @@ categories:
l1: 'Yes'
l2: fully supported
l3: ''
- class: mapreduce
l1: 'Yes'
This stuff is another area where it doesn't really make sense for a batch-only runner to have any answer.
done
layout: default
title: "Apache Hadoop MapReduce Runner"
permalink: /documentation/runners/mapreduce/
redirect_from: /learn/runners/mapreduce/
Shouldn't need this, since there's no past URL.
done
---
layout: default
title: "Apache Hadoop MapReduce Runner"
permalink: /documentation/runners/mapreduce/
I think since we will not link from the overview and will not link from the header, you should follow this up by adding a link from the work-in-progress page.
done
PTAL
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id f2ae6e6 with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
Looks mostly good. When I expand details on the capability matrix, the formatting is a bit weird. There are some boxes that start with a
PTAL Added "No" back, and it should fix the ':' issue
Refer to this link for build results (access rights to CI server needed): Jenkins built the site at commit id fa0c7a1 with Jekyll and staged it here. Happy reviewing. Note that any previous site has been deleted. This staged site will be automatically deleted after its TTL expires. Push any commit to the pull request branch or re-trigger the build to get it staged again.
LGTM from a text perspective
Actually, one thing -- does the capability matrix need to note somewhere that this runner is in development and isn't part of the release yet?
We had gearpump in the capability matrix before merging it to master. PTAL @kennknowles
It would be nice to also call it out on the capability matrix page, but I don't think that is so critical. Mostly the header and overview are where we should stick to released runners. We need a major revamp there anyhow, and that will give a good opportunity to add verbiage about in-progress runners. Let's get this added to the site so that anyone dropping in might see it.
@asfgit merge