From 35de514544fb1bdde13253283a6f35ee06b1af60 Mon Sep 17 00:00:00 2001
From: Ufuk Celebi <uce@apache.org>
Date: Thu, 24 Dec 2015 01:19:02 +0100
Subject: [PATCH] [FLINK-3132] [docs] Initial docs restructure

---
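Note on the navigation logic: the new `navbar.html` builds each dropdown by filtering `site.pages` on `top-nav-group` and sorting on `top-nav-pos`, with `_config.yml` supplying a default position of 99999. The plain-Ruby sketch below mirrors that selection outside of Jekyll. It is an illustration only: the `nav_titles` helper, the `DEFAULT_NAV_POS` constant, and the sample pages are hypothetical and are not part of this patch.

```ruby
# Hypothetical stand-in for Jekyll's site.pages: each page is a Hash of
# front matter fields. Not part of the patch; for illustration only.
DEFAULT_NAV_POS = 99_999 # mirrors the default position added to _config.yml

# Returns the link titles of one top-level group, ordered by position.
# Pages without a matching 'top-nav-group' are left out of the menu.
def nav_titles(pages, group)
  pages
    .select { |page| page["top-nav-group"] == group }
    .sort_by { |page| page.fetch("top-nav-pos", DEFAULT_NAV_POS) }
    .map { |page| page["top-nav-title"] || page["title"] }
end

pages = [
  { "title" => "Command Line Interface", "top-nav-group" => "apis" },
  { "title" => "DataSet API", "top-nav-group" => "apis", "top-nav-pos" => 2,
    "top-nav-title" => "Batch Guide" },
  { "title" => "Streaming Guide", "top-nav-group" => "apis", "top-nav-pos" => 1 },
  { "title" => "Untagged page" }
]

puts nav_titles(pages, "apis")
```

With this sample data the page without a position sorts last, the custom `top-nav-title` replaces the page title, and the page without a group does not appear.
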
 docs/README.md                                | 111 ++-
 docs/_config.yml                              |  11 +-
 docs/_includes/navbar.html                    |  74 +-
 docs/_layouts/base.html                       |   4 -
 docs/_layouts/plain.html                      |  86 +-
 docs/_plugins/highlightCode.rb                |   3 +-
 docs/_plugins/info.rb                         |  20 +
 docs/_plugins/top.rb                          |  14 +
 docs/_plugins/warn.rb                         |  20 +
 .../{ => batch}/dataset_transformations.md    |  26 +-
 docs/apis/{ => batch}/examples.md             |  57 +-
 docs/apis/batch/fault_tolerance.md            | 100 +++
 docs/apis/{ => batch}/fig/LICENSE.txt         |   0
 .../fig/iterations_delta_iterate_operator.png | Bin
 ...rations_delta_iterate_operator_example.png | Bin
 .../fig/iterations_iterate_operator.png       | Bin
 .../iterations_iterate_operator_example.png   | Bin
 .../{ => batch}/fig/iterations_supersteps.png | Bin
 docs/apis/{ => batch}/fig/plan_visualizer.png | Bin
 docs/apis/{ => batch}/hadoop_compatibility.md |  29 +-
 .../{programming_guide.md => batch/index.md}  | 236 ++---
 docs/apis/{ => batch}/iterations.md           |  10 +-
 docs/apis/{ => batch}/python.md               |  78 +-
 docs/apis/best_practices.md                   |   7 +-
 docs/apis/cli.md                              |   3 +
 docs/apis/cluster_execution.md                |   7 +-
 .../{example_connectors.md => connectors.md}  |  10 +-
 docs/apis/filesystems.md                      | 236 +++++
 docs/apis/index.md                            |   2 +-
 docs/apis/java8.md                            |  38 +-
 docs/apis/local_execution.md                  |   9 +-
 docs/apis/scala_shell.md                      |   8 +-
 docs/apis/streaming/connectors/docker.md      | 116 +++
 .../streaming/connectors/elasticsearch.md     | 165 ++++
 docs/apis/streaming/connectors/hdfs.md        | 115 +++
 docs/apis/streaming/connectors/index.md       |  26 +
 docs/apis/streaming/connectors/kafka.md       | 160 ++++
 docs/apis/streaming/connectors/rabbitmq.md    | 102 +++
 docs/apis/streaming/connectors/twitter.md     |  89 ++
 docs/apis/{ => streaming}/fault_tolerance.md  |  83 +-
 docs/apis/streaming/fig/LICENSE.txt           |  17 +
 .../fig/savepoints-overview.png               | Bin
 .../fig/savepoints-program_ids.png            | Bin
 .../index.md}                                 | 816 +-----------------
 docs/apis/{ => streaming}/savepoints.md       |   2 +
 docs/apis/{ => streaming}/state_backends.md   |   6 +-
 .../{ => streaming}/storm_compatibility.md    |   4 +-
 docs/apis/web_client.md                       |   3 +
 docs/apis/zip_elements_guide.md               | 106 ---
 docs/internals/add_operator.md                |   4 +
 docs/internals/general_arch.md                |   4 +
 docs/internals/ide_setup.md                   |   5 +-
 docs/internals/job_scheduling.md              |   3 +
 docs/internals/logging.md                     |   4 +
 docs/internals/monitoring_rest_api.md         |   3 +
 docs/internals/stream_checkpointing.md        |   4 +
 docs/internals/types_serialization.md         |   3 +
 docs/libs/gelly_guide.md                      | 105 +--
 docs/libs/index.md                            |  10 +-
 docs/libs/ml/als.md                           |   8 +-
 docs/libs/ml/contribution_guide.md            |   8 +-
 docs/libs/ml/distance_metrics.md              |   8 +-
 docs/libs/ml/index.md                         |  12 +-
 docs/libs/ml/min_max_scaler.md                |   6 +-
 docs/libs/ml/multiple_linear_regression.md    |   8 +-
 docs/libs/ml/optimization.md                  |   7 +-
 docs/libs/ml/pipelines.md                     |   7 +-
 docs/libs/ml/polynomial_features.md           |   7 +-
 docs/libs/ml/quickstart.md                    |   7 +-
 docs/libs/ml/standard_scaler.md               |   7 +-
 docs/libs/ml/svm.md                           |   7 +-
 docs/libs/table.md                            |  12 +-
 docs/page/css/flink.css                       | 120 ++-
 docs/quickstart/java_api_quickstart.md        |   4 +
 docs/quickstart/run_example_quickstart.md     |   4 +
 docs/quickstart/scala_api_quickstart.md       |   4 +
 docs/quickstart/setup_quickstart.md           |   4 +
 docs/setup/building.md                        | 110 +--
 docs/setup/cluster_setup.md                   | 301 +------
 docs/setup/config.md                          | 464 +++-------
 docs/setup/flink_on_tez.md                    | 290 -------
 docs/setup/gce_setup.md                       |  27 +-
 docs/setup/jobmanager_high_availability.md    |   3 +
 docs/setup/local_setup.md                     |  25 +-
 docs/setup/yarn_setup.md                      |  46 +-
 85 files changed, 2271 insertions(+), 2389 deletions(-)
 create mode 100644 docs/_plugins/info.rb
 create mode 100644 docs/_plugins/top.rb
 create mode 100644 docs/_plugins/warn.rb
 rename docs/apis/{ => batch}/dataset_transformations.md (99%)
 rename docs/apis/{ => batch}/examples.md (96%)
 create mode 100644 docs/apis/batch/fault_tolerance.md
 rename docs/apis/{ => batch}/fig/LICENSE.txt (100%)
 rename docs/apis/{ => batch}/fig/iterations_delta_iterate_operator.png (100%)
 rename docs/apis/{ => batch}/fig/iterations_delta_iterate_operator_example.png (100%)
 rename docs/apis/{ => batch}/fig/iterations_iterate_operator.png (100%)
 rename docs/apis/{ => batch}/fig/iterations_iterate_operator_example.png (100%)
 rename docs/apis/{ => batch}/fig/iterations_supersteps.png (100%)
 rename docs/apis/{ => batch}/fig/plan_visualizer.png (100%)
 rename docs/apis/{ => batch}/hadoop_compatibility.md (95%)
 rename docs/apis/{programming_guide.md => batch/index.md} (97%)
 rename docs/apis/{ => batch}/iterations.md (95%)
 rename docs/apis/{ => batch}/python.md (95%)
 rename docs/apis/{example_connectors.md => connectors.md} (98%)
 create mode 100644 docs/apis/filesystems.md
 create mode 100644 docs/apis/streaming/connectors/docker.md
 create mode 100644 docs/apis/streaming/connectors/elasticsearch.md
 create mode 100644 docs/apis/streaming/connectors/hdfs.md
 create mode 100644 docs/apis/streaming/connectors/index.md
 create mode 100644 docs/apis/streaming/connectors/kafka.md
 create mode 100644 docs/apis/streaming/connectors/rabbitmq.md
 create mode 100644 docs/apis/streaming/connectors/twitter.md
 rename docs/apis/{ => streaming}/fault_tolerance.md (74%)
 create mode 100644 docs/apis/streaming/fig/LICENSE.txt
 rename docs/apis/{ => streaming}/fig/savepoints-overview.png (100%)
 rename docs/apis/{ => streaming}/fig/savepoints-program_ids.png (100%)
 rename docs/apis/{streaming_guide.md => streaming/index.md} (79%)
 rename docs/apis/{ => streaming}/savepoints.md (99%)
 rename docs/apis/{ => streaming}/state_backends.md (98%)
 rename docs/apis/{ => streaming}/storm_compatibility.md (99%)
 delete mode 100644 docs/apis/zip_elements_guide.md
 delete mode 100644 docs/setup/flink_on_tez.md

diff --git a/docs/README.md b/docs/README.md
index 05dcecb742541..d37dc77ae9fdd 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -6,9 +6,9 @@ http://flink.apache.org/ is also generated from the files found here.
 
 # Requirements
 
-We use Markdown to write and Jekyll to translate the documentation to static HTML. Kramdown is 
+We use Markdown to write and Jekyll to translate the documentation to static HTML. Kramdown is
 needed for Markdown processing and the Python based Pygments is used for syntax highlighting. To run
-Javascript code from Ruby, you need to install a javascript runtime (e.g. `therubyracer`). You can 
+Javascript code from Ruby, you need to install a javascript runtime (e.g. `therubyracer`). You can
 install all needed software via the following commands:
 
     gem install jekyll -v 2.5.3
@@ -16,13 +16,13 @@ install all needed software via the following commands:
     gem install pygments.rb -v 0.6.3
     gem install therubyracer -v 0.12.2
     sudo easy_install Pygments
-    
-Note that in Ubuntu based systems, it may be necessary to install the `ruby-dev` and 
+
+Note that in Ubuntu based systems, it may be necessary to install the `ruby-dev` and
 `python-setuptools` packages via apt.
 
 # Using Dockerized Jekyll
 
-We dockerized the jekyll environment above. If you have [docker](https://docs.docker.com/), 
+We dockerized the jekyll environment above. If you have [docker](https://docs.docker.com/),
 you can run following command to start the container.
 
 ```
@@ -33,7 +33,6 @@ cd flink/docs/docker
 It takes a few moment to build the image for the first time, but will be a second from the second time.
 The run.sh command brings you in a bash session where you can run following doc commands.
 
-
 # Build
 
 The `docs/build_docs.sh` script calls Jekyll and generates the documentation in `docs/target`. You
@@ -44,12 +43,13 @@ If you call the script with the preview flag `build_docs.sh -p`, Jekyll will sta
 
 # Contribute
 
-The documentation pages are written in
-[Markdown](http://daringfireball.net/projects/markdown/syntax). It is possible to use the
-[GitHub flavored syntax](http://github.github.com/github-flavored-markdown) and intermix plain html.
+## Markdown
+
+The documentation pages are written in [Markdown](http://daringfireball.net/projects/markdown/syntax). It is possible to use [GitHub flavored syntax](http://github.github.com/github-flavored-markdown) and intermix plain HTML.
 
-In addition to Markdown, every page contains a Jekyll front matter, which specifies the title of the
-page and the layout to use. The title is used as the top-level heading for the page.
+## Front matter
+
+In addition to Markdown, every page contains Jekyll front matter, which specifies the title of the page and the layout to use. The title is used as the top-level heading for the page. The default layout is `plain` (found in `_layouts`).
 
     ---
     title: "Title of the Page"
@@ -59,20 +59,93 @@ Furthermore, you can access variables found in `docs/_config.yml` as follows:
 
     {{ site.NAME }}
 
-This will be replaced with the value of the variable called `NAME` when generating
-the docs.
+This will be replaced with the value of the variable called `NAME` when generating the docs.
+
+## Structure
 
-All documents are structed with headings. From these heading, a page outline is
-automatically generated for each page.
+### Page
+
+#### Headings
+
+All documents are structured with headings. From these headings, you can automatically generate a page table of contents (see below).
 
 ```
-# Level-1 Heading  <- Used for the title of the page
+# Level-1 Heading  <- Used for the title of the page (don't use this)
 ## Level-2 Heading <- Start with this one
 ### Level-3 heading
 #### Level-4 heading
 ##### Level-5 heading
 ```
 
-Please stick to the "logical order" when using the headlines, e.g. start with level-2 headings and
-use level-3 headings for subsections, etc. Don't use a different ordering, because you don't like
-how a headline looks.
+Please stick to the "logical order" when using the headlines, e.g. start with level-2 headings and use level-3 headings for subsections, etc. Don't use a different ordering just because you don't like how a headline looks.
+
+#### Table of Contents
+
+    * This will be replaced by the TOC
+    {:toc}
+
+
+Add this markup (both lines) to the document in order to generate a table of contents for the page. Headings up to level 3 are included.
+
+You can exclude a heading from the table of contents:
+
+    # Excluded heading
+    {:.no_toc}
+
+#### Back to Top
+
+	{% top %}
+
+This will be replaced by a default back to top link. It is recommended to use these links at least at the end of each level-2 section.
+
+#### Labels
+
+	{% info %}
+	{% warn %}
+
+These will be replaced by an info or warning label. You can change the text of the label by providing an argument:
+
+    {% info Recommendation %}
+
+### Documentation
+
+#### Top Navigation
+
+You can modify the top-level navigation in two places. You can either edit the `_includes/navbar.html` file or add tags to your page front matter (recommended).
+
+    # Top-level navigation
+    top-nav-group: apis
+    top-nav-pos: 2
+    top-nav-title: <strong>Batch Guide</strong> (DataSet API)
+
+This adds the page to the group `apis` (via `top-nav-group`) at position `2` (via `top-nav-pos`). Furthermore, it specifies a custom title for the navigation via `top-nav-title`. If this field is missing, the regular page title (via `title`) will be used. If no position is specified, the element will be added to the end of the group. If no group is specified, the page will not show up.
+
+Currently, there are groups `quickstart`, `setup`, `deployment`, `apis`, `libs`, and `internals`.
+
+#### Sub Navigation
+
+A sub navigation is shown if the field `sub-nav-group` is specified. A sub navigation groups all pages with the same `sub-nav-group`. Check out the streaming or batch guide as an example.
+
+    # Sub-level navigation
+    sub-nav-group: batch
+    sub-nav-id: dataset_api
+    sub-nav-pos: 1
+    sub-nav-title: DataSet API
+
+The fields work similarly to their `top-nav-*` counterparts.
+
+In addition, you can specify a hierarchy via `sub-nav-id` and `sub-nav-parent`:
+
+    # Sub-level navigation
+    sub-nav-group: batch
+    sub-nav-parent: dataset_api
+    sub-nav-pos: 1
+    sub-nav-title: Transformations
+
+This will show the `Transformations` page under the `DataSet API` page. The `sub-nav-parent` field has to have a matching `sub-nav-id`.
+
+#### Breadcrumbs
+
+Pages with sub navigations can use breadcrumbs like `Batch Guide > Libraries > Machine Learning > Optimization`.
+
+The breadcrumbs for the last page are generated from the front matter. For a sub navigation root to appear (like `Batch Guide` in the example above), you have to specify the `sub-nav-group-title`. This field designates a group page as the root.
diff --git a/docs/_config.yml b/docs/_config.yml
index 98fb5059d5a18..6b93bfc8bd028 100644
--- a/docs/_config.yml
+++ b/docs/_config.yml
@@ -5,9 +5,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -22,7 +22,6 @@
 #     {{ site.CONFIG_KEY }}
 #------------------------------------------------------------------------------
 
-
 # This are the version referenced in the docs. Please only use these variables
 # to reference a specific Flink version, because this is the only place where
 # we change the version for the complete docs when forking of a release branch
@@ -56,12 +55,16 @@ defaults:
       path: ""
     values:
       layout: plain
+      top-nav-pos: 99999 # Move to end
+      sub-nav-pos: 99999 # Move to end
 
 markdown: KramdownPygments
 highlighter: pygments
 
 kramdown:
-    toc_levels: 1..3
+  input: GFM # GitHub syntax
+  hard_wrap: false # Don't translate new lines to <br>s
+  toc_levels: 1..3 # Include h1-h3 for ToC
 
 host: 0.0.0.0
 
diff --git a/docs/_includes/navbar.html b/docs/_includes/navbar.html
index 91ba62af1a868..ea78d456d7bff 100644
--- a/docs/_includes/navbar.html
+++ b/docs/_includes/navbar.html
@@ -39,16 +39,16 @@
         <!-- The navigation links. -->
         <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
           <ul class="nav navbar-nav">
-            <li{% if page.url == '/' %} class="active"{% endif %}><a href="{{ site.baseurl}}/">Documentation<span class="hidden-sm hidden-xs"> {{ site.version_short }}</span></a></li>
+            <li class="hidden-sm {% if page.url == '/' %}active{% endif %}"><a href="{{ site.baseurl}}/">Documentation {{ site.version_short }}</a></li>
 
             <!-- Quickstart -->
             <li class="dropdown{% if page.url contains '/quickstart/' %} active{% endif %}">
               <a href="{{ quickstart }}" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Quickstart <span class="caret"></span></a>
               <ul class="dropdown-menu" role="menu">
-                <li><a href="{{ quickstart }}/setup_quickstart.html">Setup</a></li>
-                <li><a href="{{ quickstart }}/java_api_quickstart.html">Java API</a></li>
-                <li><a href="{{ quickstart }}/scala_api_quickstart.html">Scala API</a></li>
-                <li><a href="{{ quickstart }}/run_example_quickstart.html">Run Step-by-Step Example</a></li>
+                {% assign quickstart_group = (site.pages | where: "top-nav-group" , "quickstart" | sort: "top-nav-pos") %}
+                {% for quickstart_page in quickstart_group %}
+                <li class="{% if page.url contains quickstart_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ quickstart_page.url }}">{% if quickstart_page.top-nav-title %}{{ quickstart_page.top-nav-title }}{% else %}{{ quickstart_page.title }}{% endif %}</a></li>
+                {% endfor %}
               </ul>
             </li>
 
@@ -56,19 +56,17 @@
             <li class="dropdown{% if page.url contains '/setup/' %} active{% endif %}">
               <a href="{{ setup }}" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Setup <span class="caret"></span></a>
               <ul class="dropdown-menu" role="menu">
-                <li><a href="{{ setup }}/building.html">Build Flink {{ site.version }}</a></li>
+                {% assign setup_group = (site.pages | where: "top-nav-group" , "setup" | sort: "top-nav-pos") %}
+                {% for setup_group_page in setup_group %}
+                <li class="{% if page.url contains setup_group_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ setup_group_page.url }}">{% if setup_group_page.top-nav-title %}{{ setup_group_page.top-nav-title }}{% else %}{{ setup_group_page.title }}{% endif %}</a></li>
+                {% endfor %}
 
                 <li class="divider"></li>
                 <li role="presentation" class="dropdown-header"><strong>Deployment</strong></li>
-                <li><a href="{{ setup }}/local_setup.html" class="active">Local</a></li>
-                <li><a href="{{ setup }}/cluster_setup.html">Cluster (Standalone)</a></li>
-                <li><a href="{{ setup }}/yarn_setup.html">YARN</a></li>
-                <li><a href="{{ setup }}/gce_setup.html">GCloud</a></li>
-                <li><a href="{{ setup }}/flink_on_tez.html">Flink on Tez <span class="badge">Beta</span></a></li>
-                <li><a href="{{ setup }}/jobmanager_high_availability.html">JobManager High Availability</a></li>
-
-                <li class="divider"></li>
-                <li><a href="{{ setup }}/config.html">Configuration</a></li>
+                {% assign deployment_group = (site.pages | where: "top-nav-group" , "deployment" | sort: "top-nav-pos") %}
+                {% for deployment_group_page in deployment_group %}
+                <li class="{% if page.url contains deployment_group_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ deployment_group_page.url }}">{% if deployment_group_page.top-nav-title %}{{ deployment_group_page.top-nav-title }}{% else %}{{ deployment_group_page.title }}{% endif %}</a></li>
+                {% endfor %}
               </ul>
             </li>
 
@@ -76,27 +74,10 @@
             <li class="dropdown{% if page.url contains '/apis/' %} active{% endif %}">
               <a href="{{ apis }}" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Programming Guides <span class="caret"></span></a>
               <ul class="dropdown-menu" role="menu">
-                <li><a href="{{ apis }}/programming_guide.html"><strong>DataSet API</strong></a></li>
-                <li><a href="{{ apis }}/streaming_guide.html"><strong>DataStream API</strong></a></li>
-                <li><a href="{{ apis }}/python.html">Python API <span class="badge">Beta</span></a></li>
-
-                <li class="divider"></li>
-                <li><a href="{{ apis }}/fault_tolerance.html">Fault Tolerance</a></li>
-                <li><a href="{{ apis }}/state_backends.html">State in Streaming Programs</a></li>
-                <li><a href="{{ apis }}/savepoints.html">Savepoints</a></li>
-                <li><a href="{{ apis }}/scala_shell.html">Interactive Scala Shell</a></li>
-                <li><a href="{{ apis }}/dataset_transformations.html">DataSet Transformations</a></li>
-                <li><a href="{{ apis }}/best_practices.html">Best Practices</a></li>
-                <li><a href="{{ apis }}/example_connectors.html">Connectors (DataSet API)</a></li>
-                <li><a href="{{ apis }}/examples.html">Examples</a></li>
-                <li><a href="{{ apis }}/local_execution.html">Local Execution</a></li>
-                <li><a href="{{ apis }}/cluster_execution.html">Cluster Execution</a></li>
-                <li><a href="{{ apis }}/cli.html">Command Line Interface</a></li>
-                <li><a href="{{ apis }}/web_client.html">Web Client</a></li>
-                <li><a href="{{ apis }}/iterations.html">Iterations (DataSet API)</a></li>
-                <li><a href="{{ apis }}/java8.html">Java 8</a></li>
-                <li><a href="{{ apis }}/hadoop_compatibility.html">Hadoop Compatibility <span class="badge">Beta</span></a></li>
-                <li><a href="{{ apis }}/storm_compatibility.html">Storm Compatibility <span class="badge">Beta</span></a></li>
+                {% assign apis_group = (site.pages | where: "top-nav-group" , "apis" | sort: "top-nav-pos") %}
+                {% for apis_group_page in apis_group %}
+                <li class="{% if page.url contains apis_group_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ apis_group_page.url }}">{% if apis_group_page.top-nav-title %}{{ apis_group_page.top-nav-title }}{% else %}{{ apis_group_page.title }}{% endif %}</a></li>
+                {% endfor %}
               </ul>
             </li>
 
@@ -104,9 +85,10 @@
             <li class="dropdown{% if page.url contains '/libs/' %} active{% endif %}">
               <a href="{{ libs }}" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Libraries <span class="caret"></span></a>
                 <ul class="dropdown-menu" role="menu">
-                  <li><a href="{{ libs }}/gelly_guide.html">Graphs: Gelly</a></li>
-                  <li><a href="{{ libs }}/ml/">Machine Learning <span class="badge">Beta</span></a></li>
-                  <li><a href="{{ libs }}/table.html">Relational: Table <span class="badge">Beta</span></a></li>
+                  {% assign libs_group = (site.pages | where: "top-nav-group" , "libs" | sort: "top-nav-pos") %}
+                  {% for libs_page in libs_group %}
+                  <li class="{% if page.url contains libs_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ libs_page.url }}">{% if libs_page.top-nav-title %}{{ libs_page.top-nav-title }}{% else %}{{ libs_page.title }}{% endif %}</a></li>
+                  {% endfor %}
               </ul>
             </li>
 
@@ -117,16 +99,10 @@
                 <li role="presentation" class="dropdown-header"><strong>Contribute</strong></li>
                 <li><a href="http://flink.apache.org/how-to-contribute.html"><small><span class="glyphicon glyphicon-new-window"></span></small> How to Contribute</a></li>
                 <li><a href="http://flink.apache.org/contribute-code.html#coding-guidelines"><small><span class="glyphicon glyphicon-new-window"></span></small> Coding Guidelines</a></li>
-                <li><a href="{{ internals }}/ide_setup.html">IDE Setup</a></li>
-                <li><a href="{{ internals }}/logging.html">Logging</a></li>
-                <li class="divider"></li>
-                <li role="presentation" class="dropdown-header"><strong>Internals</strong></li>
-                <li><a href="{{ internals }}/general_arch.html">Architecture &amp; Process Model</a></li>
-                <li><a href="{{ internals }}/stream_checkpointing.html">Fault Tolerance for Data Streaming</a></li>
-                <li><a href="{{ internals }}/types_serialization.html">Type Extraction &amp; Serialization</a></li>
-                <li><a href="{{ internals }}/monitoring_rest_api.html">Monitoring REST API</a></li>
-                <li><a href="{{ internals }}/job_scheduling.html">Jobs &amp; Scheduling</a></li>
-                <li><a href="{{ internals }}/add_operator.html">How-To: Add an Operator</a></li>
+                {% assign internals_group = (site.pages | where: "top-nav-group" , "internals" | sort: "top-nav-pos") %}
+                {% for internals_page in internals_group %}
+                <li class="{% if page.url contains internals_page.url %}active{% endif %}"><a href="{{ site.baseurl }}{{ internals_page.url }}">{% if internals_page.top-nav-title %}{{ internals_page.top-nav-title }}{% else %}{{ internals_page.title }}{% endif %}</a></li>
+                {% endfor %}
               </ul>
             </li>
           </ul>
diff --git a/docs/_layouts/base.html b/docs/_layouts/base.html
index dfd0f65968780..d4e74efe43d64 100644
--- a/docs/_layouts/base.html
+++ b/docs/_layouts/base.html
@@ -23,11 +23,7 @@
     <meta http-equiv="X-UA-Compatible" content="IE=edge">
     <meta name="viewport" content="width=device-width, initial-scale=1">
     <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
-    {% if page.htmlTitle %}
-    <title>Apache Flink {{ site.version}} Documentation: {{ page.htmlTitle }}</title>
-    {% else %}
     <title>Apache Flink {{ site.version}} Documentation: {{ page.title }}</title>
-    {% endif %}
     <link rel="shortcut icon" href="{{ site.baseurl }}/page/favicon.ico" type="image/x-icon">
     <link rel="icon" href="{{ site.baseurl }}/page/favicon.ico" type="image/x-icon">
 
diff --git a/docs/_layouts/plain.html b/docs/_layouts/plain.html
index 6dd9305a21276..f06487925a1f3 100644
--- a/docs/_layouts/plain.html
+++ b/docs/_layouts/plain.html
@@ -20,14 +20,92 @@
 under the License.
 -->
 <div class="row">
-  <div class="col-sm-10 col-sm-offset-1">
-    <h1>{{ page.title }}{% if page.is_beta %} <span class="badge">Beta</span>{% endif %}</h1>
+{% if page.sub-nav-group %}
+{% comment %}
+The plain layout with a sub navigation.
 
-{{ content }}
+- This is activated via the 'sub-nav-group' field in the preamble.
+- All pages of this sub nav group will be displayed in the sub navigation:
+  * Each element without a 'sub-nav-parent' field will be displayed on the 1st level, where the position is defined via 'sub-nav-pos'.
+  * If the page should be displayed as a child element, it needs to specify a 'sub-nav-parent' field, which matches the 'sub-nav-id' of its parent. The parent only needs to specify this if it expects child nodes.
+{% endcomment %}
+  <!-- Sub Navigation -->
+  <div class="col-sm-3">
+    <ul id="sub-nav">
+      {% comment %} Get all pages belonging to this group sorted by their position {% endcomment %}
+      {% assign group = (site.pages | where: "sub-nav-group" , page.sub-nav-group | where: "sub-nav-parent" , nil | sort: "sub-nav-pos") %}
+      {% for group_page in group %}
+        {% if group_page.sub-nav-id  %}
+        {% assign sub_group = (site.pages | where: "sub-nav-group" , page.sub-nav-group | where: "sub-nav-parent" , group_page.sub-nav-id | sort: "sub-nav-pos") %}
+        {% else %}
+        {% assign sub_group = nil %}
+        {% endif %}
+        <li><a href="{{ site.baseurl }}{{ group_page.url }}" class="{% if page.url contains group_page.url %}active{% endif %}">{% if group_page.sub-nav-title %}{{ group_page.sub-nav-title }}{% else %}{{ group_page.title }}{% endif %}</a>
+          {% if sub_group and sub_group.size > 0 %}
+          <ul>
+            {% for sub_group_page in sub_group %}
+              <li><a href="{{ site.baseurl }}{{ sub_group_page.url }}" class="{% if page.url contains sub_group_page.url or (sub_group_page.sub-nav-id and page.sub-nav-parent and sub_group_page.sub-nav-id == page.sub-nav-parent) %}active{% endif %}">{% if sub_group_page.sub-nav-title %}{{ sub_group_page.sub-nav-title }}{% else %}{{ sub_group_page.title }}{% endif %}</a></li>
+            {% endfor %}
+          </ul>
+          {% endif %}
+        </li>
+      {% endfor %}
+    </ul>
   </div>
+  <!-- Main -->
+  <div class="col-sm-9">
+    <!-- Top anchor -->
+    <a href="#top"></a>
+
+    <!-- Breadcrumbs above the main heading -->
+    <ol class="breadcrumb">
+      {% for group_page in group %}
+      {% if group_page.sub-nav-group-title %}
+      <li><a href="{{ site.baseurl }}{{ group_page.url }}">{{ group_page.sub-nav-group-title }}</a></li>
+      {% endif %}
+      {% endfor %}
+
+      {% if page.sub-nav-parent %}
+      {% assign parent = (site.pages | where: "sub-nav-group" , page.sub-nav-group | where: "sub-nav-id" , page.sub-nav-parent | first) %}
+      {% if parent %}
+
+      {% if parent.sub-nav-parent %}
+      {% assign grandparent = (site.pages | where: "sub-nav-group" , page.sub-nav-group | where: "sub-nav-id" , parent.sub-nav-parent | first) %}
+
+      {% if grandparent %}
+      <li><a href="{{ site.baseurl }}{{ grandparent.url }}">{% if grandparent.sub-nav-title %}{{ grandparent.sub-nav-title }}{% else %}{{ grandparent.title }}{% endif %}</a></li>
+      {% endif %}
+
+      {% endif %}
 
-  <div class="col-sm-10 col-sm-offset-1">
+      <li><a href="{{ site.baseurl }}{{ parent.url }}">{% if parent.sub-nav-title %}{{ parent.sub-nav-title }}{% else %}{{ parent.title }}{% endif %}</a></li>
+      {% endif %}
+      {% endif %}
+      <li class="active">{% if page.sub-nav-title %}{{ page.sub-nav-title }}{% else %}{{ page.title }}{% endif %}</li>
+    </ol>
+
+    <div class="text">
+      <!-- Main heading -->
+      <h1>{{ page.title }}{% if page.is_beta %} <span class="beta">(Beta)</span>{% endif %}</h1>
+
+      <!-- Content -->
+      {{ content }}
+    </div>
+  </div>
+{% else %}
+{% comment %}
+The plain layout without a sub navigation (only text).
+{% endcomment %}
+  <div class="col-md-8 col-md-offset-2 text">
+    <h1>{{ page.title }}{% if page.is_beta %} <span class="badge">Beta</span>{% endif %}</h1>
+{{ content }}
+  </div>
+{% endif %}
+  {% comment %}
+  Removed until Robert complains... ;)
+  <div class="col-sm-8 col-sm-offset-2">
     <!-- Disqus thread and some vertical offset -->
     <div style="margin-top: 75px; margin-bottom: 50px" id="disqus_thread"></div>
   </div>
+  {% endcomment %}
 </div>
diff --git a/docs/_plugins/highlightCode.rb b/docs/_plugins/highlightCode.rb
index 5da8b07568b1d..74f6d6f0ff144 100644
--- a/docs/_plugins/highlightCode.rb
+++ b/docs/_plugins/highlightCode.rb
@@ -91,7 +91,8 @@ def convert(content)
         :toc_levels           => @config['kramdown']['toc_levels'],
         :smart_quotes         => @config['kramdown']['smart_quotes'],
         :coderay_default_lang => @config['kramdown']['default_lang'],
-        :input                => @config['kramdown']['input']
+        :input                => @config['kramdown']['input'],
+        :hard_wrap            => @config['kramdown']['hard_wrap']
     }).to_pygs
     return html
   end
diff --git a/docs/_plugins/info.rb b/docs/_plugins/info.rb
new file mode 100644
index 0000000000000..de3238dcf14ab
--- /dev/null
+++ b/docs/_plugins/info.rb
@@ -0,0 +1,20 @@
+module Jekyll
+  class InfoTag < Liquid::Tag
+
+    def initialize(tag_name, text, tokens)
+      super
+      @text = text
+    end
+
+    def render(context)
+      if @text.to_s == ''
+        @text = "Info"
+      end
+
+      @text = @text.strip unless @text.nil?
+      "<span class=\"label label-info\">#{@text}</span>"
+    end
+  end
+end
+
+Liquid::Template.register_tag('info', Jekyll::InfoTag)
diff --git a/docs/_plugins/top.rb b/docs/_plugins/top.rb
new file mode 100644
index 0000000000000..da7846e3e857f
--- /dev/null
+++ b/docs/_plugins/top.rb
@@ -0,0 +1,14 @@
+module Jekyll
+  class TopTag < Liquid::Tag
+
+    def initialize(tag_name, text, tokens)
+      super
+    end
+
+    def render(context)
+    	"<a href=\"\#top\" class=\"top pull-right\"><span class=\"glyphicon glyphicon-chevron-up\"></span> Back to top</a>"
+    end
+  end
+end
+
+Liquid::Template.register_tag('top', Jekyll::TopTag)
diff --git a/docs/_plugins/warn.rb b/docs/_plugins/warn.rb
new file mode 100644
index 0000000000000..b9eaad498c6bf
--- /dev/null
+++ b/docs/_plugins/warn.rb
@@ -0,0 +1,20 @@
+module Jekyll
+  class WarnTag < Liquid::Tag
+
+    def initialize(tag_name, text, tokens)
+      super
+      @text = text
+    end
+
+    def render(context)
+      if @text.to_s == ''
+        @text = "Warning"
+      end
+
+      @text = @text.strip unless @text.nil?
+      "<span class=\"label label-danger\">#{@text}</span>"
+    end
+  end
+end
+
+Liquid::Template.register_tag('warn', Jekyll::WarnTag)
diff --git a/docs/apis/dataset_transformations.md b/docs/apis/batch/dataset_transformations.md
similarity index 99%
rename from docs/apis/dataset_transformations.md
rename to docs/apis/batch/dataset_transformations.md
index 89521702b7d48..b9d7f0c3929da 100644
--- a/docs/apis/dataset_transformations.md
+++ b/docs/apis/batch/dataset_transformations.md
@@ -1,5 +1,11 @@
 ---
 title: "DataSet Transformations"
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-parent: dataset_api
+sub-nav-pos: 1
+sub-nav-title: Transformations
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -930,7 +936,7 @@ The following code removes all duplicate elements from the DataSet:
 ~~~java
 DataSet<Tuple2<Integer, Double>> input = // [...]
 DataSet<Tuple2<Integer, Double>> output = input.distinct();
-                                     
+
 ~~~
 
 </div>
@@ -966,7 +972,7 @@ It is also possible to change how the distinction of the elements in the DataSet
 ~~~java
 DataSet<Tuple2<Integer, Double, String>> input = // [...]
 DataSet<Tuple2<Integer, Double, String>> output = input.distinct(0,2);
-                                     
+
 ~~~
 
 </div>
@@ -1003,7 +1009,7 @@ private static final long serialVersionUID = 1L;
 }
 DataSet<Integer> input = // [...]
 DataSet<Integer> output = input.distinct(new AbsSelector());
-                                     
+
 ~~~
 
 </div>
@@ -1040,7 +1046,7 @@ public class CustomType {
 
 DataSet<CustomType> input = // [...]
 DataSet<CustomType> output = input.distinct("aName", "aNumber");
-                                     
+
 ~~~
 
 </div>
@@ -1073,7 +1079,7 @@ It is also possible to indicate to use all the fields by the wildcard character:
 ~~~java
 DataSet<CustomType> input = // [...]
 DataSet<CustomType> output = input.distinct("*");
-                                     
+
 ~~~
 
 </div>
@@ -1212,10 +1218,10 @@ val weightedRatings = ratings.join(weights).where("category").equalTo(0) {
 ~~~python
  class PointWeighter(JoinFunction):
    def join(self, rating, weight):
-     return (rating[0], rating[1] * weight[1]) 
+     return (rating[0], rating[1] * weight[1])
        if value1[3]:
 
- weightedRatings = 
+ weightedRatings =
    ratings.join(weights).where(0).equal_to(0). \
    with(new PointWeighter(), (STRING, FLOAT));
 ~~~
@@ -1294,7 +1300,7 @@ val weightedRatings = ratings.join(weights).where("category").equalTo(0) {
 A Join transformation can construct result tuples using a projection as shown here:
 
 ~~~python
- result = input1.join(input2).where(0).equal_to(0) \ 
+ result = input1.join(input2).where(0).equal_to(0) \
   .project_first(0,2).project_second(1).project_first(1);
 ~~~
 
@@ -1429,7 +1435,7 @@ The following hints are available:
 
 ### OuterJoin
 
-The OuterJoin transformation performs a left, right, or full outer join on two data sets. Outer joins are similar to regular (inner) joins and create all pairs of elements that are equal on their keys. In addition, records of the "outer" side (left, right, or both in case of full) are preserved if no matching key is found in the other side. Matching pair of elements (or one element and a `null` value for the other input) are given to a `JoinFunction` to turn the pair of elements into a single element, or to a `FlatJoinFunction` to turn the pair of elements into arbitararily many (including none) elements. 
+The OuterJoin transformation performs a left, right, or full outer join on two data sets. Outer joins are similar to regular (inner) joins and create all pairs of elements that are equal on their keys. In addition, records of the "outer" side (left, right, or both in case of full) are preserved if no matching key is found on the other side. Matching pairs of elements (or one element and a `null` value for the other input) are given to a `JoinFunction` to turn the pair of elements into a single element, or to a `FlatJoinFunction` to turn the pair of elements into arbitrarily many (including none) elements.
 
 The elements of both DataSets are joined on one or more keys which can be specified using
 
@@ -1599,7 +1605,7 @@ The following hints are available.
 * `REPARTITION_HASH_FIRST`: The system partitions (shuffles) each input (unless the input is already
   partitioned) and builds a hash table from the first input. This strategy is good if the first
   input is smaller than the second, but both inputs are still large.
-  
+
 * `REPARTITION_HASH_SECOND`: The system partitions (shuffles) each input (unless the input is already
   partitioned) and builds a hash table from the second input. This strategy is good if the second
   input is smaller than the first, but both inputs are still large.
diff --git a/docs/apis/examples.md b/docs/apis/batch/examples.md
similarity index 96%
rename from docs/apis/examples.md
rename to docs/apis/batch/examples.md
index d22b436dbe593..e982b21f19754 100644
--- a/docs/apis/examples.md
+++ b/docs/apis/batch/examples.md
@@ -1,5 +1,10 @@
 ---
 title:  "Bundled Examples"
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-pos: 5
+sub-nav-title: Examples
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,9 +25,9 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-The following example programs showcase different applications of Flink 
-from simple word counting to graph algorithms. The code samples illustrate the 
-use of [Flink's API](programming_guide.html). 
+The following example programs showcase different applications of Flink
+from simple word counting to graph algorithms. The code samples illustrate the
+use of [Flink's API](index.html).
 
 The full source code of the following and more examples can be found in the __flink-java-examples__
 or __flink-scala-examples__ module of the Flink source repository.
@@ -65,9 +70,9 @@ WordCount is the "Hello World" of Big Data processing systems. It computes the f
 ~~~java
 ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
 
-DataSet<String> text = env.readTextFile("/path/to/file"); 
+DataSet<String> text = env.readTextFile("/path/to/file");
 
-DataSet<Tuple2<String, Integer>> counts = 
+DataSet<Tuple2<String, Integer>> counts =
         // split up the lines in pairs (2-tuples) containing: (word,1)
         text.flatMap(new Tokenizer())
         // group by the tuple field "0" and sum up tuple field "1"
@@ -83,7 +88,7 @@ public static class Tokenizer implements FlatMapFunction<String, Tuple2<String,
     public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
         // normalize and split the line
         String[] tokens = value.toLowerCase().split("\\W+");
-        
+
         // emit the pairs
         for (String token : tokens) {
             if (token.length() > 0) {
@@ -150,7 +155,7 @@ DataSet<Tuple2<Long, Double>> newRanks = iteration
         .map(new Dampener(DAMPENING_FACTOR, numPages));
 
 DataSet<Tuple2<Long, Double>> finalPageRanks = iteration.closeWith(
-        newRanks, 
+        newRanks,
         newRanks.join(iteration).where(0).equalTo(0)
         // termination condition
         .filter(new EpsilonFilter()));
@@ -159,17 +164,17 @@ finalPageRanks.writeAsCsv(outputPath, "\n", " ");
 
 // User-defined functions
 
-public static final class JoinVertexWithEdgesMatch 
-                    implements FlatJoinFunction<Tuple2<Long, Double>, Tuple2<Long, Long[]>, 
+public static final class JoinVertexWithEdgesMatch
+                    implements FlatJoinFunction<Tuple2<Long, Double>, Tuple2<Long, Long[]>,
                                             Tuple2<Long, Double>> {
 
     @Override
+    public void join(Tuple2<Long, Double> page, Tuple2<Long, Long[]> adj,
+    public void join(<Tuple2<Long, Double> page, Tuple2<Long, Long[]> adj,
                         Collector<Tuple2<Long, Double>> out) {
         Long[] neigbors = adj.f1;
         double rank = page.f1;
         double rankToDistribute = rank / ((double) neigbors.length);
-            
+
         for (int i = 0; i < neigbors.length; i++) {
             out.collect(new Tuple2<Long, Double>(neigbors[i], rankToDistribute));
         }
@@ -191,7 +196,7 @@ public static final class Dampener implements MapFunction<Tuple2<Long,Double>, T
     }
 }
 
-public static final class EpsilonFilter 
+public static final class EpsilonFilter
                 implements FilterFunction<Tuple2<Tuple2<Long, Double>, Tuple2<Long, Double>>> {
 
     @Override
@@ -297,12 +302,12 @@ DataSet<Tuple2<Long, Long>> edges = getEdgeDataSet(env).flatMap(new UndirectEdge
 
 // assign the initial component IDs (equal to the vertex ID)
 DataSet<Tuple2<Long, Long>> verticesWithInitialId = vertices.map(new DuplicateValue<Long>());
-        
+
 // open a delta iteration
 DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
         verticesWithInitialId.iterateDelta(verticesWithInitialId, maxIterations, 0);
 
-// apply the step logic: 
+// apply the step logic:
 DataSet<Tuple2<Long, Long>> changes = iteration.getWorkset()
         // join with the edges
         .join(edges).where(0).equalTo(0).with(new NeighborWithComponentIDJoin())
@@ -321,17 +326,17 @@ result.writeAsCsv(outputPath, "\n", " ");
 // User-defined functions
 
 public static final class DuplicateValue<T> implements MapFunction<T, Tuple2<T, T>> {
-    
+
     @Override
     public Tuple2<T, T> map(T vertex) {
         return new Tuple2<T, T>(vertex, vertex);
     }
 }
 
-public static final class UndirectEdge 
+public static final class UndirectEdge
                     implements FlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {
     Tuple2<Long, Long> invertedEdge = new Tuple2<Long, Long>();
-    
+
     @Override
     public void flatMap(Tuple2<Long, Long> edge, Collector<Tuple2<Long, Long>> out) {
         invertedEdge.f0 = edge.f1;
@@ -341,7 +346,7 @@ public static final class UndirectEdge
     }
 }
 
-public static final class NeighborWithComponentIDJoin 
+public static final class NeighborWithComponentIDJoin
                 implements JoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>> {
 
     @Override
@@ -350,12 +355,12 @@ public static final class NeighborWithComponentIDJoin
     }
 }
 
-public static final class ComponentIdFilter 
-                    implements FlatMapFunction<Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>>, 
+public static final class ComponentIdFilter
+                    implements FlatMapFunction<Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>>,
                                             Tuple2<Long, Long>> {
 
     @Override
-    public void flatMap(Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>> value, 
+    public void flatMap(Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>> value,
                         Collector<Tuple2<Long, Long>> out) {
         if (value.f0.f1 < value.f1.f1) {
             out.collect(value.f0);
@@ -404,7 +409,7 @@ val verticesWithComponents = vertices.iterateDelta(vertices, maxIterations, Arra
 }
 
 verticesWithComponents.writeAsCsv(outputPath, "\n", " ")
-    
+
 ~~~
 
 The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/graph/ConnectedComponents.scala "ConnectedComponents program" %} implements the above example. It requires the following parameters to run: `<vertex input path>, <edge input path>, <output path> <max num iterations>`.
@@ -427,7 +432,7 @@ The example implements the following SQL query.
 SELECT l_orderkey, o_shippriority, sum(l_extendedprice) as revenue
     FROM orders, lineitem
 WHERE l_orderkey = o_orderkey
-    AND o_orderstatus = "F" 
+    AND o_orderstatus = "F"
     AND YEAR(o_orderdate) > 1993
     AND o_orderpriority LIKE "5%"
 GROUP BY l_orderkey, o_shippriority;
@@ -468,14 +473,14 @@ DataSet<Tuple2<Integer, Integer>> ordersFilteredByYear =
         .project(0,4).types(Integer.class, Integer.class);
 
 // join orders with lineitems: (orderkey, shippriority, extendedprice)
-DataSet<Tuple3<Integer, Integer, Double>> lineitemsOfOrders = 
+DataSet<Tuple3<Integer, Integer, Double>> lineitemsOfOrders =
         ordersFilteredByYear.joinWithHuge(lineitems)
                             .where(0).equalTo(0)
                             .projectFirst(0,1).projectSecond(1)
                             .types(Integer.class, Integer.class, Double.class);
 
 // extendedprice sums: (orderkey, shippriority, sum(extendedprice))
-DataSet<Tuple3<Integer, Integer, Double>> priceSums = 
+DataSet<Tuple3<Integer, Integer, Double>> priceSums =
         // group by order and sum extendedprice
         lineitemsOfOrders.groupBy(0,1).aggregate(Aggregations.SUM, 2);
 
@@ -494,7 +499,7 @@ The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/fl
 </div>
 </div>
 
-The orders and lineitem files can be generated using the [TPC-H benchmark](http://www.tpc.org/tpch/) suite's data generator tool (DBGEN). 
+The orders and lineitem files can be generated using the [TPC-H benchmark](http://www.tpc.org/tpch/) suite's data generator tool (DBGEN).
 Take the following steps to generate arbitrary large input files for the provided Flink programs:
 
 1.  Download and unpack DBGEN
diff --git a/docs/apis/batch/fault_tolerance.md b/docs/apis/batch/fault_tolerance.md
new file mode 100644
index 0000000000000..51a6b4134164a
--- /dev/null
+++ b/docs/apis/batch/fault_tolerance.md
@@ -0,0 +1,100 @@
+---
+title: "Fault Tolerance"
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-pos: 2
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Flink's fault tolerance mechanism recovers programs in the presence of failures and
+continues to execute them. Such failures include machine hardware failures, network failures,
+transient program failures, etc.
+
+* This will be replaced by the TOC
+{:toc}
+
+Batch Processing Fault Tolerance (DataSet API)
+----------------------------------------------
+
+Fault tolerance for programs in the *DataSet API* works by retrying failed executions.
+The number of times that Flink retries the execution before the job is declared as failed is configurable
+via the *execution retries* parameter. A value of *0* effectively means that fault tolerance is deactivated.
+
+To activate the fault tolerance, set the *execution retries* to a value larger than zero. A common choice is a value
+of three.
+
+This example shows how to configure the execution retries for a Flink DataSet program.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+env.setNumberOfExecutionRetries(3);
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val env = ExecutionEnvironment.getExecutionEnvironment
+env.setNumberOfExecutionRetries(3)
+{% endhighlight %}
+</div>
+</div>
+
+
+You can also define the default value for the number of execution retries in the `flink-conf.yaml`:
+
+~~~
+execution-retries.default: 3
+~~~
+
+
+Retry Delays
+------------
+
+Execution retries can be configured to be delayed. Delaying the retry means that after a failed execution, the re-execution does not start
+immediately, but only after a certain delay.
+
+Delaying the retries can be helpful when the program interacts with external systems where, for example, connections or pending transactions should reach a timeout before re-execution is attempted.
+
+You can set the retry delay for each program as follows (the sample shows the DataStream API; the DataSet API works similarly):
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.getConfig().setExecutionRetryDelay(5000); // 5000 milliseconds delay
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.getConfig.setExecutionRetryDelay(5000) // 5000 milliseconds delay
+{% endhighlight %}
+</div>
+</div>
+
+You can also define the default value for the retry delay in the `flink-conf.yaml`:
+
+~~~
+execution-retries.delay: 10 s
+~~~
+
+{% top %}
diff --git a/docs/apis/fig/LICENSE.txt b/docs/apis/batch/fig/LICENSE.txt
similarity index 100%
rename from docs/apis/fig/LICENSE.txt
rename to docs/apis/batch/fig/LICENSE.txt
diff --git a/docs/apis/fig/iterations_delta_iterate_operator.png b/docs/apis/batch/fig/iterations_delta_iterate_operator.png
similarity index 100%
rename from docs/apis/fig/iterations_delta_iterate_operator.png
rename to docs/apis/batch/fig/iterations_delta_iterate_operator.png
diff --git a/docs/apis/fig/iterations_delta_iterate_operator_example.png b/docs/apis/batch/fig/iterations_delta_iterate_operator_example.png
similarity index 100%
rename from docs/apis/fig/iterations_delta_iterate_operator_example.png
rename to docs/apis/batch/fig/iterations_delta_iterate_operator_example.png
diff --git a/docs/apis/fig/iterations_iterate_operator.png b/docs/apis/batch/fig/iterations_iterate_operator.png
similarity index 100%
rename from docs/apis/fig/iterations_iterate_operator.png
rename to docs/apis/batch/fig/iterations_iterate_operator.png
diff --git a/docs/apis/fig/iterations_iterate_operator_example.png b/docs/apis/batch/fig/iterations_iterate_operator_example.png
similarity index 100%
rename from docs/apis/fig/iterations_iterate_operator_example.png
rename to docs/apis/batch/fig/iterations_iterate_operator_example.png
diff --git a/docs/apis/fig/iterations_supersteps.png b/docs/apis/batch/fig/iterations_supersteps.png
similarity index 100%
rename from docs/apis/fig/iterations_supersteps.png
rename to docs/apis/batch/fig/iterations_supersteps.png
diff --git a/docs/apis/fig/plan_visualizer.png b/docs/apis/batch/fig/plan_visualizer.png
similarity index 100%
rename from docs/apis/fig/plan_visualizer.png
rename to docs/apis/batch/fig/plan_visualizer.png
diff --git a/docs/apis/hadoop_compatibility.md b/docs/apis/batch/hadoop_compatibility.md
similarity index 95%
rename from docs/apis/hadoop_compatibility.md
rename to docs/apis/batch/hadoop_compatibility.md
index d88dc0b74b223..68a6b055f7181 100644
--- a/docs/apis/hadoop_compatibility.md
+++ b/docs/apis/batch/hadoop_compatibility.md
@@ -1,6 +1,9 @@
 ---
 title: "Hadoop Compatibility"
 is_beta: true
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-pos: 7
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -33,7 +36,7 @@ You can:
 - use a Hadoop `Reducer` as [GroupReduceFunction](dataset_transformations.html#groupreduce-on-grouped-dataset).
 
 This document shows how to use existing Hadoop MapReduce code with Flink. Please refer to the
-[Connecting to other systems](example_connectors.html) guide for reading from Hadoop supported file systems.
+[Connecting to other systems]({{ site.baseurl }}/apis/connectors.html) guide for reading from Hadoop supported file systems.
 
 * This will be replaced by the TOC
 {:toc}
@@ -101,7 +104,7 @@ DataSet<Tuple2<LongWritable, Text>> input =
 
 ~~~scala
 val env = ExecutionEnvironment.getExecutionEnvironment
-		
+
 val input: DataSet[(LongWritable, Text)] =
   env.readHadoopFile(new TextInputFormat, classOf[LongWritable], classOf[Text], textPath)
 
@@ -129,9 +132,9 @@ The following example shows how to use Hadoop's `TextOutputFormat`.
 ~~~java
 // Obtain the result we want to emit
 DataSet<Tuple2<Text, IntWritable>> hadoopResult = [...]
-		
+
 // Set up the Hadoop TextOutputFormat.
-HadoopOutputFormat<Text, IntWritable> hadoopOF = 
+HadoopOutputFormat<Text, IntWritable> hadoopOF =
   // create the Flink wrapper.
   new HadoopOutputFormat<Text, IntWritable>(
     // set the Hadoop OutputFormat and specify the job.
@@ -139,7 +142,7 @@ HadoopOutputFormat<Text, IntWritable> hadoopOF =
   );
 hadoopOF.getConfiguration().set("mapreduce.output.textoutputformat.separator", " ");
 TextOutputFormat.setOutputPath(job, new Path(outputPath));
-		
+
 // Emit data using the Hadoop TextOutputFormat.
 hadoopResult.output(hadoopOF);
 ~~~
@@ -160,7 +163,7 @@ FileOutputFormat.setOutputPath(hadoopOF.getJobConf, new Path(resultPath))
 
 hadoopResult.output(hadoopOF)
 
-		
+
 ~~~
 
 </div>
@@ -173,7 +176,7 @@ Hadoop Mappers are semantically equivalent to Flink's [FlatMapFunctions](dataset
 
 The wrappers take a `DataSet<Tuple2<KEYIN,VALUEIN>>` as input and produce a `DataSet<Tuple2<KEYOUT,VALUEOUT>>` as output where `KEYIN` and `KEYOUT` are the keys and `VALUEIN` and `VALUEOUT` are the values of the Hadoop key-value pairs that are processed by the Hadoop functions. For Reducers, Flink offers a wrapper for a GroupReduceFunction with (`HadoopReduceCombineFunction`) and without a Combiner (`HadoopReduceFunction`). The wrappers accept an optional `JobConf` object to configure the Hadoop Mapper or Reducer.
 
-Flink's function wrappers are 
+Flink's function wrappers are
 
 - `org.apache.flink.hadoopcompatibility.mapred.HadoopMapFunction`,
 - `org.apache.flink.hadoopcompatibility.mapred.HadoopReduceFunction`, and
@@ -199,7 +202,7 @@ DataSet<Tuple2<Text, LongWritable>> result = text
   ));
 ~~~
 
-**Please note:** The Reducer wrapper works on groups as defined by Flink's [groupBy()](dataset_transformations.html#transformations-on-grouped-dataset) operation. It does not consider any custom partitioners, sort or grouping comparators you might have set in the `JobConf`. 
+**Please note:** The Reducer wrapper works on groups as defined by Flink's [groupBy()](dataset_transformations.html#transformations-on-grouped-dataset) operation. It does not consider any custom partitioners, sort or grouping comparators you might have set in the `JobConf`.
 
 ### Complete Hadoop WordCount Example
 
@@ -207,15 +210,15 @@ The following example shows a complete WordCount implementation using Hadoop dat
 
 ~~~java
 ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-		
+
 // Set up the Hadoop TextInputFormat.
 Job job = Job.getInstance();
-HadoopInputFormat<LongWritable, Text> hadoopIF = 
+HadoopInputFormat<LongWritable, Text> hadoopIF =
   new HadoopInputFormat<LongWritable, Text>(
     new TextInputFormat(), LongWritable.class, Text.class, job
   );
 TextInputFormat.addInputPath(job, new Path(inputPath));
-		
+
 // Read data using the Hadoop TextInputFormat.
 DataSet<Tuple2<LongWritable, Text>> text = env.createInput(hadoopIF);
 
@@ -231,13 +234,13 @@ DataSet<Tuple2<Text, LongWritable>> result = text
   ));
 
 // Set up the Hadoop TextOutputFormat.
-HadoopOutputFormat<Text, IntWritable> hadoopOF = 
+HadoopOutputFormat<Text, IntWritable> hadoopOF =
   new HadoopOutputFormat<Text, IntWritable>(
     new TextOutputFormat<Text, IntWritable>(), job
   );
 hadoopOF.getConfiguration().set("mapreduce.output.textoutputformat.separator", " ");
 TextOutputFormat.setOutputPath(job, new Path(outputPath));
-		
+
 // Emit data using the Hadoop TextOutputFormat.
 result.output(hadoopOF);
 
diff --git a/docs/apis/programming_guide.md b/docs/apis/batch/index.md
similarity index 97%
rename from docs/apis/programming_guide.md
rename to docs/apis/batch/index.md
index 002e8bc0ca4e8..2c4f429c3a808 100644
--- a/docs/apis/programming_guide.md
+++ b/docs/apis/batch/index.md
@@ -1,5 +1,17 @@
 ---
 title: "Flink DataSet API Programming Guide"
+
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 2
+top-nav-title: <strong>Batch Guide</strong> (DataSet API)
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-group-title: Batch Guide
+sub-nav-id: dataset_api
+sub-nav-pos: 1
+sub-nav-title: DataSet API
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,8 +32,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
 DataSet programs in Flink are regular programs that implement transformations on data sets
 (e.g., filtering, mapping, joining, grouping). The data sets are initially created from certain
 sources (e.g., by reading files, or from local collections). Results are returned via sinks, which may for
@@ -103,7 +113,7 @@ object WordCount {
 
 </div>
 
-[Back to top](#top)
+{% top %}
 
 
 Linking with Flink
@@ -197,7 +207,7 @@ to run your program on Flink with Scala 2.11, you need to add a `_2.11` suffix t
 values of the Flink modules in your dependencies section.
 
 If you are looking for building Flink with Scala 2.11, please check
-[build guide](../setup/building.html#build-flink-for-a-specific-scala-version).
+[build guide]({{ site.baseurl }}/setup/building.html#scala-versions).
 
 #### Hadoop Dependency Versions
 
@@ -211,9 +221,9 @@ In order to link against the latest SNAPSHOT versions of the code, please follow
 
 The *flink-clients* dependency is only necessary to invoke the Flink program locally (for example to
 run it standalone for testing and debugging).  If you intend to only export the program as a JAR
-file and [run it on a cluster](cluster_execution.html), you can skip that dependency.
+file and [run it on a cluster]({{ site.baseurl }}/apis/cluster_execution.html), you can skip that dependency.
 
-[Back to top](#top)
+{% top %}
 
 Program Skeleton
 ----------------
@@ -253,8 +263,8 @@ Typically, you only need to use `getExecutionEnvironment()`, since this
 will do the right thing depending on the context: if you are executing
 your program inside an IDE or as a regular Java program it will create
 a local environment that will execute your program on your local machine. If
-you created a JAR file from your program, and invoke it through the [command line](cli.html)
-or the [web interface](web_client.html),
+you created a JAR file from your program, and invoke it through the [command line]({{ site.baseurl }}/apis/cli.html)
+or the [web interface]({{ site.baseurl }}/apis/web_client.html),
 the Flink cluster manager will execute your main method and `getExecutionEnvironment()` will return
 an execution environment for executing your program on a cluster.
 
@@ -294,7 +304,7 @@ This will create a new DataSet by converting every String in the original
 set to an Integer. For more information and a list of all the transformations,
 please refer to [Transformations](#transformations).
 
-Once you have a DataSet containing your final results, you can either write the result 
+Once you have a DataSet containing your final results, you can either write the result
 to a file system (HDFS or local) or print it.
 
 {% highlight java %}
@@ -321,7 +331,7 @@ programs with a `main()` method. Each program consists of the same basic parts:
 5. Trigger the program execution
 
 We will now give an overview of each of those steps, please refer to the respective sections for
-more details. Note that all core classes of the Scala API are found in the package 
+more details. Note that all core classes of the Scala API are found in the package
 {% gh_link /flink-scala/src/main/scala/org/apache/flink/api/scala "org.apache.flink.api.scala" %}.
 
 
@@ -405,16 +415,16 @@ def collect()
 </div>
 
 
-The first two methods (`writeAsText()` and `writeAsCsv()`) do as the name suggests, the third one 
-can be used to specify a custom data output format. Please refer to [Data Sinks](#data-sinks) for 
+The first two methods (`writeAsText()` and `writeAsCsv()`) do as their names suggest; the third one
+can be used to specify a custom data output format. Please refer to [Data Sinks](#data-sinks) for
 more information on writing to files and also about custom data output formats.
 
-The `print()` method is useful for developing/debugging. It will output the contents of the DataSet 
+The `print()` method is useful for developing/debugging. It will output the contents of the DataSet
 to standard output (on the JVM starting the Flink execution). **NOTE** The behavior of the `print()`
-method changed with Flink 0.9.x. Before it was printing to the log file of the workers, now its 
+method changed with Flink 0.9.x. Before, it printed to the log file of the workers; now it is
 sending the DataSet results to the client and printing the results there.
 
-`collect()` retrieve the DataSet from the cluster to the local JVM. The `collect()` method 
+`collect()` retrieves the DataSet from the cluster to the local JVM. The `collect()` method
 will return a `List` containing the elements.
 
 Both `print()` and `collect()` will trigger the execution of the program. You don't need to further call `execute()`.
@@ -424,14 +434,14 @@ Both `print()` and `collect()` will trigger the execution of the program. You do
 the data sizes you can retrieve with `collect()` are limited due to our RPC system. It is not advised
 to collect DataSets larger than 10MBs.
 
-There is also a `printOnTaskManager()` method which will print the DataSet contents on the TaskManager 
+There is also a `printOnTaskManager()` method which will print the DataSet contents on the TaskManager
 (so you have to get them from the log file). The `printOnTaskManager()` method will not trigger a
 program execution.
 
 Once you specified the complete program you need to **trigger the program execution**. You can call
 `execute()` directly on the `ExecutionEnviroment` or you implicitly trigger the execution with
 `collect()` or `print()`.
-Depending on the type of the `ExecutionEnvironment` the execution will be triggered on your local 
+Depending on the type of the `ExecutionEnvironment` the execution will be triggered on your local
 machine or submit your program for execution on a cluster.
 
 Note that you can not call both `print()` (or `collect()`) and `execute()` at the end of program.
@@ -441,7 +451,7 @@ accumulator results. `print()` and `collect()` are not returning the result, but
 accessed from the `getLastJobExecutionResult()` method.
 
 
-[Back to top](#top)
+{% top %}
 
 
 DataSet abstraction
@@ -450,13 +460,13 @@ DataSet abstraction
 A `DataSet` is an abstract representation of a finite immutable collection of data of the same type that may contain duplicates.
 
 Note that Flink is not always physically creating (materializing) each DataSet at runtime. This
-depends on the used runtime, the configuration and optimizer decisions. DataSets may be "streamed through" 
+depends on the used runtime, the configuration and optimizer decisions. DataSets may be "streamed through"
 operations during execution, as under the hood Flink uses a streaming data processing engine.
 
 Some DataSets are materialized automatically to avoid distributed deadlocks (at points where the data flow graph branches
 out and joins again later) or if the execution mode has explicitly been set to blocking execution.
 
-[Back to top](#top)
+{% top %}
 
 
 Lazy Evaluation
@@ -464,7 +474,7 @@ Lazy Evaluation
 
 All Flink DataSet programs are executed lazily: When the program's main method is executed, the data loading
 and transformations do not happen directly. Rather, each operation is created and added to the
-program's plan. The operations are actually executed when the execution is explicitly triggered by 
+program's plan. The operations are actually executed when the execution is explicitly triggered by
 an `execute()` call on the ExecutionEnvironment object. Also, `collect()` and `print()` will trigger
 the job execution. Whether the program is executed locally or on a cluster depends
 on the environment of the program.
@@ -472,7 +482,7 @@ on the environment of the program.
 The lazy evaluation lets you construct sophisticated programs that Flink executes as one
 holistically planned unit.
 
-[Back to top](#top)
+{% top %}
 
 
 Transformations
@@ -552,7 +562,7 @@ data.mapPartition(new MapPartitionFunction<String, Long>() {
       <td>
         <p>Evaluates a boolean function for each element and retains those for which the function
         returns true.<br/>
-        
+
         <strong>IMPORTANT:</strong> The system assumes that the function does not modify the elements on which the predicate is applied. Violating this assumption
         can lead to incorrect results.
         </p>
@@ -617,14 +627,14 @@ DataSet<Tuple3<Integer, String, Double>> output = input.sum(0).andMin(2);
     <tr>
       <td><strong>Distinct</strong></td>
       <td>
-        <p>Returns the distinct elements of a data set. It removes the duplicate entries 
+        <p>Returns the distinct elements of a data set. It removes the duplicate entries
         from the input DataSet, with respect to all fields of the elements, or a subset of fields.</p>
     {% highlight java %}
-        data.distinct(); 
+        data.distinct();
     {% endhighlight %}
       </td>
     </tr>
-    
+
     <tr>
       <td><strong>Join</strong></td>
       <td>
@@ -639,11 +649,11 @@ result = input1.join(input2)
 {% endhighlight %}
         You can specify the way that the runtime executes the join via <i>Join Hints</i>. The hints
         describe whether the join happens through partitioning or broadcasting, and whether it uses
-        a sort-based or a hash-based algorithm. Please refer to the 
+        a sort-based or a hash-based algorithm. Please refer to the
         <a href="dataset_transformations.html#join-algorithm-hints">Transformations Guide</a> for
         a list of possible hints and an example.</br>
         If no hint is specified, the system will try to make an estimate of the input sizes and
-        pick a the best strategy according to those estimates. 
+        pick the best strategy according to those estimates.
 {% highlight java %}
 // This executes a join by broadcasting the first data set
 // using a hash table for the broadcasted data
@@ -664,7 +674,7 @@ input1.leftOuterJoin(input2) // rightOuterJoin or fullOuterJoin for right or ful
       .equalTo(1)            // key of the second input (tuple field 1)
       .with(new JoinFunction<String, String, String>() {
           public String join(String v1, String v2) {
-             // NOTE: 
+             // NOTE:
              // - v2 might be null for leftOuterJoin
              // - v1 might be null for rightOuterJoin
              // - v1 OR v2 might be null for fullOuterJoin
@@ -767,8 +777,8 @@ DataSet<Integer> result = in.partitionCustom(Partitioner<K> partitioner, key)
     <tr>
       <td><strong>Sort Partition</strong></td>
       <td>
-        <p>Locally sorts all partitions of a data set on a specified field in a specified order. 
-          Fields can be specified as tuple positions or field expressions. 
+        <p>Locally sorts all partitions of a data set on a specified field in a specified order.
+          Fields can be specified as tuple positions or field expressions.
           Sorting on multiple fields is done by chaining sortPartition() calls.</p>
 {% highlight java %}
 DataSet<Tuple2<String,Integer>> in = // [...]
@@ -920,16 +930,16 @@ val output: DataSet[(Int, String, Doublr)] = input.sum(0).min(2)
 {% endhighlight %}
       </td>
     </tr>
-    
+
     <tr>
       <td><strong>Distinct</strong></td>
       <td>
-        <p>Returns the distinct elements of a data set. It removes the duplicate entries 
+        <p>Returns the distinct elements of a data set. It removes the duplicate entries
         from the input DataSet, with respect to all fields of the elements, or a subset of fields.</p>
       {% highlight scala %}
-         data.distinct() 
+         data.distinct()
       {% endhighlight %}
-      </td> 
+      </td>
     </tr>
 
     </tr>
@@ -946,7 +956,7 @@ val result = input1.join(input2).where(0).equalTo(1)
 {% endhighlight %}
         You can specify the way that the runtime executes the join via <i>Join Hints</i>. The hints
         describe whether the join happens through partitioning or broadcasting, and whether it uses
-        a sort-based or a hash-based algorithm. Please refer to the 
+        a sort-based or a hash-based algorithm. Please refer to the
         <a href="dataset_transformations.html#join-algorithm-hints">Transformations Guide</a> for
         a list of possible hints and an example.</br>
         If no hint is specified, the system will try to make an estimate of the input sizes and
@@ -1057,8 +1067,8 @@ val result = in
     <tr>
       <td><strong>Sort Partition</strong></td>
       <td>
-        <p>Locally sorts all partitions of a data set on a specified field in a specified order. 
-          Fields can be specified as tuple positions or field expressions. 
+        <p>Locally sorts all partitions of a data set on a specified field in a specified order.
+          Fields can be specified as tuple positions or field expressions.
           Sorting on multiple fields is done by chaining sortPartition() calls.</p>
 {% highlight scala %}
 val in: DataSet[(Int, String)] = // [...]
@@ -1094,7 +1104,7 @@ possible for [Data Sources](#data-sources) and [Data Sinks](#data-sinks).
 
 `withParameters(Configuration)` passes Configuration objects, which can be accessed from the `open()` method inside the user function.
 
-[Back to Top](#top)
+{% top %}
 
 
 Specifying Keys
@@ -1208,7 +1218,7 @@ In the example below, we have a `WC` POJO with two fields "word" and "count". To
 {% highlight java %}
 // some ordinary POJO (Plain old Java Object)
 public class WC {
-  public String word; 
+  public String word;
   public int count;
 }
 DataSet<WC> words = // [...]
@@ -1317,8 +1327,8 @@ These are valid field expressions for the example code above:
 ### Define keys using Key Selector Functions
 {:.no_toc}
 
-An additional way to define keys are "key selector" functions. A key selector function 
-takes a single dataset element as input and returns the key for the element. The key can be of any type and be derived from arbitrary computations. 
+Another way to define keys is through "key selector" functions. A key selector function
+takes a single dataset element as input and returns the key for the element. The key can be of any type and be derived from arbitrary computations.
 
 The following example shows a key selector function that simply returns the field of an object:
 
@@ -1349,7 +1359,7 @@ val wordCounts = words
 </div>
 
 
-[Back to top](#top)
+{% top %}
 
 
 Passing Functions to Flink
@@ -1382,7 +1392,7 @@ data.map(new MapFunction<String, Integer> () {
 
 #### Java 8 Lambdas
 
-Flink also supports Java 8 Lambdas in the Java API. Please see the full [Java 8 Guide](java8.html).
+Flink also supports Java 8 Lambdas in the Java API. Please see the full [Java 8 Guide]({{ site.baseurl }}/apis/java8.html).
 
 {% highlight java %}
 DataSet<String> data = // [...]
@@ -1495,7 +1505,7 @@ the
 [transformations documentation](dataset_transformations.html)
 for a complete example.
 
-[Back to top](#top)
+{% top %}
 
 
 Data Types
@@ -1520,7 +1530,7 @@ There are six different categories of data types:
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
 
-Tuples are composite types that contain a fixed number of fields with various types. 
+Tuples are composite types that contain a fixed number of fields with various types.
 The Java API provides classes from `Tuple1` up to `Tuple25`. Every field of a tuple
 can be an arbitrary Flink type including further tuples, resulting in nested tuples. Fields of a
 tuple can be accessed directly using the field's name as `tuple.f4`, or using the generic getter method
@@ -1634,16 +1644,16 @@ wordCounts groupBy { _.word } reduce(new MyReduceFunction())
 
 #### Primitive Types
 
-Flink supports all Java and Scala primitive types such as `Integer`, `String`, and `Double`. 
+Flink supports all Java and Scala primitive types such as `Integer`, `String`, and `Double`.
 
 #### General Class Types
 
-Flink supports most Java and Scala classes (API and custom). 
+Flink supports most Java and Scala classes (API and custom).
 Restrictions apply to classes containing fields that cannot be serialized, like file pointers, I/O streams, or other native
 resources. Classes that follow the Java Beans conventions work well in general.
 
-All classes that are not identified as POJO types (see POJO requirements above) are handled by Flink as general class types. 
-Flink treats these data types as black boxes and is not able to access their their content (i.e., for efficient sorting). General types are de/serialized using the serialization framework [Kryo](https://github.com/EsotericSoftware/kryo). 
+All classes that are not identified as POJO types (see POJO requirements above) are handled by Flink as general class types.
+Flink treats these data types as black boxes and is not able to access their content (e.g., for efficient sorting). General types are de/serialized using the serialization framework [Kryo](https://github.com/EsotericSoftware/kryo).
 
 When grouping, sorting, or joining a data set of generic types, keys must be specified with key selector functions. See the [key definition section](#specifying-keys) or [data transformation section](#transformations) for details.
 
@@ -1721,7 +1731,7 @@ There is a switch at the `ExectionConfig` which allows users to enable the objec
 
 
 
-[Back to top](#top)
+{% top %}
 
 
 Data Sources
@@ -1750,16 +1760,16 @@ File-based:
 
 - `readFileOfPrimitives(path, Class)` / `PrimitiveInputFormat` - Parses files of new-line (or another char sequence)
   delimited primitive data types such as `String` or `Integer`.
-   
+
 - `readFileOfPrimitives(path, delimiter, Class)` / `PrimitiveInputFormat` - Parses files of new-line (or another char sequence)
    delimited primitive data types such as `String` or `Integer` using the given delimiter.
-   
-- `readHadoopFile(FileInputFormat, Key, Value, path)` / `FileInputFormat` - Creates a JobConf and reads file from the specified 
+
+- `readHadoopFile(FileInputFormat, Key, Value, path)` / `FileInputFormat` - Creates a JobConf and reads a file from the specified
    path with the specified FileInputFormat, Key class and Value class and returns them as Tuple2<Key, Value>.
-   
+
 - `readSequenceFile(Key, Value, path)` / `SequenceFileInputFormat` - Creates a JobConf and reads file from the specified path with
    type SequenceFileInputFormat, Key class and Value class and returns them as Tuple2<Key, Value>.
- 
+
 
 Collection-based:
 
@@ -1807,12 +1817,12 @@ DataSet<Tuple2<String, Double>> csvInput = env.readCsvFile("hdfs:///the/CSV/file
 // read a CSV file with three fields into a POJO (Person.class) with corresponding fields
 DataSet<Person>> csvInput = env.readCsvFile("hdfs:///the/CSV/file")
                          .pojoType(Person.class, "name", "age", "zipcode");  
-                                                 
 
-// read a file from the specified path of type TextInputFormat 
+
+// read a file from the specified path of type TextInputFormat
 DataSet<Tuple2<LongWritable, Text>> tuples =
  env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, "hdfs://nnHost:nnPort/path/to/file");
-                         
+
 // read a file from the specified path of type SequenceFileInputFormat
 DataSet<Tuple2<IntWritable, Text>> tuples =
  env.readSequenceFile(IntWritable.class, Text.class, "hdfs://nnHost:nnPort/path/to/file");
@@ -1824,7 +1834,7 @@ DataSet<String> value = env.fromElements("Foo", "bar", "foobar", "fubar");
 DataSet<Long> numbers = env.generateSequence(1, 10000000);
 
 // Read data from a relational database using the JDBC input format
-DataSet<Tuple2<String, Integer> dbData = 
+DataSet<Tuple2<String, Integer>> dbData =
     env.createInput(
       // create and configure input format
       JDBCInputFormat.buildJDBCInputFormat()
@@ -1905,10 +1915,10 @@ File-based:
 
 - `readFileOfPrimitives(path, delimiter)` / `PrimitiveInputFormat` - Parses files of new-line (or another char sequence)
   delimited primitive data types such as `String` or `Integer` using the given delimiter.
-  
-- `readHadoopFile(FileInputFormat, Key, Value, path)` / `FileInputFormat` - Creates a JobConf and reads file from the specified 
+
+- `readHadoopFile(FileInputFormat, Key, Value, path)` / `FileInputFormat` - Creates a JobConf and reads a file from the specified
    path with the specified FileInputFormat, Key class and Value class and returns them as Tuple2<Key, Value>.
-   
+
 - `readSequenceFile(Key, Value, path)` / `SequenceFileInputFormat` - Creates a JobConf and reads file from the specified path with
    type SequenceFileInputFormat, Key class and Value class and returns them as Tuple2<Key, Value>.  
 
@@ -1971,10 +1981,10 @@ val values = env.fromElements("Foo", "bar", "foobar", "fubar")
 // generate a number sequence
 val numbers = env.generateSequence(1, 10000000);
 
-// read a file from the specified path of type TextInputFormat 
+// read a file from the specified path of type TextInputFormat
 val tuples = env.readHadoopFile(new TextInputFormat, classOf[LongWritable],
  classOf[Text], "hdfs://nnHost:nnPort/path/to/file")
-                         
+
 // read a file from the specified path of type SequenceFileInputFormat
 val tuples = env.readSequenceFile(classOf[IntWritable], classOf[Text],
  "hdfs://nnHost:nnPort/path/to/file")
@@ -2054,7 +2064,7 @@ The following table lists the currently supported compression methods.
 </table>
 
 
-[Back to top](#top)
+{% top %}
 
 
 Execution Configuration
@@ -2088,13 +2098,13 @@ With the closure cleaner disabled, it might happen that an anonymous user functi
 
 - `getExecutionRetryDelay()` / `setExecutionRetryDelay(long executionRetryDelay)` Sets the delay in milliseconds that the system waits after a job has failed, before re-executing it. The delay starts after all tasks have been successfully been stopped on the TaskManagers, and once the delay is past, the tasks are re-started. This parameter is useful to delay re-execution in order to let certain time-out related failures surface fully (like broken connections that have not fully timed out), before attempting a re-execution and immediately failing again due to the same problem. This parameter only has an effect if the number of execution re-tries is one or more.
 
-- `getExecutionMode()` / `setExecutionMode()`. The default execution mode is PIPELINED. Sets the execution mode to execute the program. The execution mode defines whether data exchanges are performed in a batch or on a pipelined manner. 
+- `getExecutionMode()` / `setExecutionMode()`. The default execution mode is PIPELINED. Sets the execution mode to execute the program. The execution mode defines whether data exchanges are performed in a batch or in a pipelined manner.
 
 - `enableForceKryo()` / **`disableForceKryo`**. Kryo is not forced by default. Forces the GenericTypeInformation to use the Kryo serializer for POJOS even though we could analyze them as a POJO. In some cases this might be preferable. For example, when Flink's internal serializers fail to handle a POJO properly.
 
 - `enableForceAvro()` / **`disableForceAvro()`**. Avro is not forced by default. Forces the Flink AvroTypeInformation to use the Avro serializer instead of Kryo for serializing Avro POJOs.
 
-- `enableObjectReuse()` / **`disableObjectReuse()`** By default, objects are not reused in Flink. Enabling the [object reuse mode](programming_guide.html#object-reuse-behavior) will instruct the runtime to reuse user objects for better performance. Keep in mind that this can lead to bugs when the user-code function of an operation is not aware of this behavior. 
+- `enableObjectReuse()` / **`disableObjectReuse()`** By default, objects are not reused in Flink. Enabling the [object reuse mode](#object-reuse-behavior) will instruct the runtime to reuse user objects for better performance. Keep in mind that this can lead to bugs when the user-code function of an operation is not aware of this behavior.
 
 - **`enableSysoutLogging()`** / `disableSysoutLogging()` JobManager status updates are printed to `System.out` by default. This setting allows to disable this behavior.
 
@@ -2108,7 +2118,7 @@ With the closure cleaner disabled, it might happen that an anonymous user functi
 
 - `registerKryoType(Class<?> type)` If the type ends up being serialized with Kryo, then it will be registered at Kryo to make sure that only tags (integer IDs) are written. If a type is not registered with Kryo, its entire class-name will be serialized with every instance, leading to much higher I/O costs.
 
-- `registerPojoType(Class<?> type)` Registers the given type with the serialization stack. If the type is eventually serialized as a POJO, then the type is registered with the POJO serializer. If the type ends up being serialized with Kryo, then it will be registered at Kryo to make sure that only tags are written. If a type is not registered with Kryo, its entire class-name will be serialized with every instance, leading to much higher I/O costs. 
+- `registerPojoType(Class<?> type)` Registers the given type with the serialization stack. If the type is eventually serialized as a POJO, then the type is registered with the POJO serializer. If the type ends up being serialized with Kryo, then it will be registered at Kryo to make sure that only tags are written. If a type is not registered with Kryo, its entire class-name will be serialized with every instance, leading to much higher I/O costs.
 
 Note that types registered with `registerKryoType()` are not available to Flink's Kryo serializer instance.
 
@@ -2119,7 +2129,7 @@ Note that types registered with `registerKryoType()` are not available to Flink'
 The `RuntimeContext` which is accessible in `Rich*` functions through the `getRuntimeContext()` method also allows to access the `ExecutionConfig` in all user defined functions.
 
 
-[Back to top](#top)
+{% top %}
 
 Data Sinks
 ----------
@@ -2156,7 +2166,7 @@ same time run additional transformations on them.
 Standard data sink methods:
 
 {% highlight java %}
-// text data 
+// text data
 DataSet<String> textData = // [...]
 
 // write DataSet to a file on the local file system
@@ -2258,7 +2268,7 @@ same time run additional transformations on them.
 Standard data sink methods:
 
 {% highlight scala %}
-// text data 
+// text data
 val textData: DataSet[String] = // [...]
 
 // write DataSet to a file on the local file system
@@ -2317,7 +2327,7 @@ Globally sorted output is not supported yet.
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Debugging
 ---------
@@ -2422,7 +2432,7 @@ val myLongs = env.fromCollection(longIt)
 `Serializable`. Furthermore, collection data sources can not be executed in parallel (
 parallelism = 1).
 
-[Back to top](#top)
+{% top %}
 
 Iteration Operators
 -------------------
@@ -2565,7 +2575,7 @@ val env = ExecutionEnvironment.getExecutionEnvironment()
 val initial = env.fromElements(0)
 
 val count = initial.iterate(10000) { iterationInput: DataSet[Int] =>
-  val result = iterationInput.map { i => 
+  val result = iterationInput.map { i =>
     val x = Math.random()
     val y = Math.random()
     i + (if (x * x + y * y < 1) 1 else 0)
@@ -2632,24 +2642,24 @@ env.execute()
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 
 Semantic Annotations
 -----------
 
-Semantic annotations can be used to give Flink hints about the behavior of a function. 
+Semantic annotations can be used to give Flink hints about the behavior of a function.
 They tell the system which fields of a function's input the function reads and evaluates and
-which fields it unmodified forwards from its input to its output. 
+which fields it forwards unmodified from its input to its output.
 Semantic annotations are a powerful means to speed up execution, because they
 allow the system to reason about reusing sort orders or partitions across multiple operations. Using
 semantic annotations may eventually save the program from unnecessary data shuffling or unnecessary
 sorts and significantly improve the performance of a program.
 
-**Note:** The use of semantic annotations is optional. However, it is absolutely crucial to 
-be conservative when providing semantic annotations! 
-Incorrect semantic annotations will cause Flink to make incorrect assumptions about your program and 
-might eventually lead to incorrect results. 
+**Note:** The use of semantic annotations is optional. However, it is absolutely crucial to
+be conservative when providing semantic annotations!
+Incorrect semantic annotations will cause Flink to make incorrect assumptions about your program and
+might eventually lead to incorrect results.
 If the behavior of an operator is not clearly predictable, no annotation should be provided.
 Please read the documentation carefully.
 
@@ -2657,15 +2667,15 @@ The following semantic annotations are currently supported.
 
 #### Forwarded Fields Annotation
 
-Forwarded fields information declares input fields which are unmodified forwarded by a function to the same position or to another position in the output. 
-This information is used by the optimizer to infer whether a data property such as sorting or 
+Forwarded fields information declares input fields which are forwarded unmodified by a function to the same position or to another position in the output.
+This information is used by the optimizer to infer whether a data property such as sorting or
 partitioning is preserved by a function.
 For functions that operate on groups of input elements such as `GroupReduce`, `GroupCombine`, `CoGroup`, and `MapPartition`, all fields that are defined as forwarded fields must always be jointly forwarded from the same input element. The forwarded fields of each element that is emitted by a group-wise function may originate from a different element of the function's input group.
 
 Field forward information is specified using [field expressions](#define-keys-using-field-expressions).
-Fields that are forwarded to the same position in the output can be specified by their position. 
+Fields that are forwarded to the same position in the output can be specified by their position.
 The specified position must be valid for the input and output data type and have the same type.
-For example the String `"f2"` declares that the third field of a Java input tuple is always equal to the third field in the output tuple. 
+For example, the String `"f2"` declares that the third field of a Java input tuple is always equal to the third field in the output tuple.
 
 Fields which are unmodified forwarded to another position in the output are declared by specifying the
 source field in the input and the target field in the output as field expressions.
@@ -2674,7 +2684,7 @@ unchanged copied to the third field of the Java output tuple. The wildcard expre
 
 Multiple forwarded fields can be declared in a single String by separating them with semicolons as `"f0; f2->f1; f3->f2"` or in separate Strings `"f0", "f2->f1", "f3->f2"`. When specifying forwarded fields it is not required that all forwarded fields are declared, but all declarations must be correct.
 
-Forwarded field information can be declared by attaching Java annotations on function class definitions or 
+Forwarded field information can be declared by attaching Java annotations on function class definitions or
 by passing them as operator arguments after invoking a function on a DataSet as shown below.
 
 ##### Function Class Annotations
@@ -2686,10 +2696,10 @@ by passing them as operator arguments after invoking a function on a DataSet as
 ##### Operator Arguments
 
 * `data.map(myMapFnc).withForwardedFields()` for single input function such as Map and Reduce.
-* `data1.join(data2).where().equalTo().with(myJoinFnc).withForwardFieldsFirst()` for the first input of a function with two inputs such as Join and CoGroup. 
+* `data1.join(data2).where().equalTo().with(myJoinFnc).withForwardedFieldsFirst()` for the first input of a function with two inputs such as Join and CoGroup.
 * `data1.join(data2).where().equalTo().with(myJoinFnc).withForwardFieldsSecond()` for the second input of a function with two inputs such as Join and CoGroup.
 
-Please note that it is not possible to overwrite field forward information which was specified as a class annotation by operator arguments. 
+Please note that it is not possible to overwrite field forward information which was specified as a class annotation by operator arguments.
 
 ##### Example
 
@@ -2699,7 +2709,7 @@ The following example shows how to declare forwarded field information using a f
 <div data-lang="java" markdown="1">
 {% highlight java %}
 @ForwardedFields("f0->f2")
-public class MyMap implements 
+public class MyMap implements
               MapFunction<Tuple2<Integer, Integer>, Tuple3<String, Integer, Integer>> {
   @Override
   public Tuple3<String, Integer, Integer> map(Tuple2<Integer, Integer> val) {
@@ -2723,17 +2733,17 @@ class MyMap extends MapFunction[(Int, Int), (String, Int, Int)]{
 
 #### Non-Forwarded Fields
 
-Non-forwarded fields information declares all fields which are not preserved on the same position in a function's output. 
-The values of all other fields are considered to be preserved at the same position in the output. 
-Hence, non-forwarded fields information is inverse to forwarded fields information. 
+Non-forwarded fields information declares all fields which are not preserved at the same position in a function's output.
+The values of all other fields are considered to be preserved at the same position in the output.
+Hence, non-forwarded fields information is the inverse of forwarded fields information.
 Non-forwarded field information for group-wise operators such as `GroupReduce`, `GroupCombine`, `CoGroup`, and `MapPartition` must fulfill the same requirements as for forwarded field information.
 
-**IMPORTANT**: The specification of non-forwarded fields information is optional. However if used, 
+**IMPORTANT**: The specification of non-forwarded fields information is optional. However, if used,
 **ALL!** non-forwarded fields must be specified, because all other fields are considered to be forwarded in place. It is safe to declare a forwarded field as non-forwarded.
 
-Non-forwarded fields are specified as a list of [field expressions](#define-keys-using-field-expressions). The list can be either given as a single String with field expressions separated by semicolons or as multiple Strings. 
-For example both `"f1; f3"` and `"f1", "f3"` declare that the second and fourth field of a Java tuple 
-are not preserved in place and all other fields are preserved in place. 
+Non-forwarded fields are specified as a list of [field expressions](#define-keys-using-field-expressions). The list can be either given as a single String with field expressions separated by semicolons or as multiple Strings.
+For example both `"f1; f3"` and `"f1", "f3"` declare that the second and fourth field of a Java tuple
+are not preserved in place and all other fields are preserved in place.
 Non-forwarded field information can only be specified for functions which have identical input and output types.
 
 Non-forwarded field information is specified as function class annotations using the following annotations:
@@ -2750,7 +2760,7 @@ The following example shows how to declare non-forwarded field information:
 <div data-lang="java" markdown="1">
 {% highlight java %}
 @NonForwardedFields("f1") // second field is not forwarded
-public class MyMap implements 
+public class MyMap implements
               MapFunction<Tuple2<Integer, Integer>, Tuple2<Integer, Integer>> {
   @Override
   public Tuple2<Integer, Integer> map(Tuple2<Integer, Integer> val) {
@@ -2779,10 +2789,10 @@ all fields that are used by the function to compute its result.
 For example, fields which are evaluated in conditional statements or used for computations must be marked as read when specifying read fields information.
 Fields which are only unmodified forwarded to the output without evaluating their values or fields which are not accessed at all are not considered to be read.
 
-**IMPORTANT**: The specification of read fields information is optional. However if used, 
+**IMPORTANT**: The specification of read fields information is optional. However, if used,
 **ALL!** read fields must be specified. It is safe to declare a non-read field as read.
 
-Read fields are specified as a list of [field expressions](#define-keys-using-field-expressions). The list can be either given as a single String with field expressions separated by semicolons or as multiple Strings. 
+Read fields are specified as a list of [field expressions](#define-keys-using-field-expressions). The list can be either given as a single String with field expressions separated by semicolons or as multiple Strings.
 For example both `"f1; f3"` and `"f1", "f3"` declare that the second and fourth field of a Java tuple are read and evaluated by the function.
 
 Read field information is specified as function class annotations using the following annotations:
@@ -2798,9 +2808,9 @@ The following example shows how to declare read field information:
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
 {% highlight java %}
-@ReadFields("f0; f3") // f0 and f3 are read and evaluated by the function. 
-public class MyMap implements 
-              MapFunction<Tuple4<Integer, Integer, Integer, Integer>, 
+@ReadFields("f0; f3") // f0 and f3 are read and evaluated by the function.
+public class MyMap implements
+              MapFunction<Tuple4<Integer, Integer, Integer, Integer>,
                           Tuple2<Integer, Integer>> {
   @Override
   public Tuple2<Integer, Integer> map(Tuple4<Integer, Integer, Integer, Integer> val) {
@@ -2830,7 +2840,7 @@ class MyMap extends MapFunction[(Int, Int, Int, Int), (Int, Int)]{
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 
 Broadcast Variables
@@ -2903,14 +2913,14 @@ accessing broadcasted data sets. For a complete example program, have a look at
 too large. For simpler things like scalar values you can simply make parameters part of the closure
 of a function, or use the `withParameters(...)` method to pass in a configuration.
 
-[Back to top](#top)
+{% top %}
 
 Passing Parameters to Functions
 -------------------
 
 Parameters can be passed to functions using either the constructor or the `withParameters(Configuration)` method. The parameters are serialized as part of the function object and shipped to all parallel task instances.
 
-Check also the [best practices guide on how to pass command line arguments to functions](best_practices.html#parsing-command-line-arguments-and-passing-them-around-in-your-flink-application).
+Check also the [best practices guide on how to pass command line arguments to functions]({{ site.baseurl }}/apis/best_practices.html#parsing-command-line-arguments-and-passing-them-around-in-your-flink-application).
 
 #### Via Constructor
 
@@ -3049,7 +3059,7 @@ public static final class Tokenizer extends RichFlatMapFunction<String, Tuple2<S
     // ... more here ...
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Program Packaging and Distributed Execution
 -----------------------------------------
@@ -3057,7 +3067,7 @@ Program Packaging and Distributed Execution
 As described in the [program skeleton](#program-skeleton) section, Flink programs can be executed on
 clusters by using the `RemoteEnvironment`. Alternatively, programs can be packaged into JAR Files
 (Java Archives) for execution. Packaging the program is a prerequisite to executing them through the
-[command line interface](cli.html) or the [web interface](web_client.html).
+[command line interface]({{ site.baseurl }}/apis/cli.html) or the [web interface]({{ site.baseurl }}/apis/web_client.html).
 
 #### Packaging Programs
 
@@ -3104,7 +3114,7 @@ calls the `getPlan(String...)` method to obtain the program plan to execute.
 3. If the entry point class does not implement the `org.apache.flinkapi.common.Program` interface,
 the system will invoke the main method of the class.
 
-[Back to top](#top)
+{% top %}
 
 Accumulators & Counters
 ---------------------------
@@ -3123,7 +3133,7 @@ interface.
 
 - {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/IntCounter.java "__IntCounter__" %},
   {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/LongCounter.java "__LongCounter__" %}
-  and {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/DoubleCounter.java "__DoubleCounter__" %}: 
+  and {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/DoubleCounter.java "__DoubleCounter__" %}:
   See below for an example using a counter.
 - {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Histogram.java "__Histogram__" %}:
   A histogram implementation for a discrete number of bins. Internally it is just a map from Integer
@@ -3186,7 +3196,7 @@ or {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators
 result type ```R``` for the final result. E.g. for a histogram, ```V``` is a number and ```R``` i
  a histogram. ```SimpleAccumulator``` is for the cases where both types are the same, e.g. for counters.
 
-[Back to top](#top)
+{% top %}
 
 Parallel Execution
 ------------------
@@ -3339,7 +3349,7 @@ A system-wide default parallelism for all execution environments can be defined
 `parallelism.default` property in `./conf/flink-conf.yaml`. See the
 [Configuration]({{ site.baseurl }}/setup/config.html) documentation for details.
 
-[Back to top](#top)
+{% top %}
 
 Execution Plans
 ---------------
@@ -3398,4 +3408,4 @@ The script to start the webinterface is located under ```bin/start-webclient.sh`
 
 You are able to specify program arguments in the textbox at the bottom of the page. Checking the plan visualization checkbox shows the execution plan before executing the actual program.
 
-[Back to top](#top)
+{% top %}
diff --git a/docs/apis/iterations.md b/docs/apis/batch/iterations.md
similarity index 95%
rename from docs/apis/iterations.md
rename to docs/apis/batch/iterations.md
index 54d1b241b1c01..912f378daae6c 100644
--- a/docs/apis/iterations.md
+++ b/docs/apis/batch/iterations.md
@@ -1,5 +1,9 @@
 ---
 title:  "Iterations"
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-pos: 3
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -24,7 +28,7 @@ Iterative algorithms occur in many domains of data analysis, such as *machine le
 
 Flink programs implement iterative algorithms by defining a **step function** and embedding it into a special iteration operator. There are two  variants of this operator: **Iterate** and **Delta Iterate**. Both operators repeatedly invoke the step function on the current iteration state until a certain termination condition is reached.
 
-Here, we provide background on both operator variants and outline their usage. The [programming guide](programming_guide.html) explains how to implement the operators in both Scala and Java. We also support both **vertex-centric and gather-sum-apply iterations** through Flink's graph processing API, [Gelly]({{site.baseurl}}/libs/gelly_guide.html).
+Here, we provide background on both operator variants and outline their usage. The [programming guide](index.html) explains how to implement the operators in both Scala and Java. We also support both **vertex-centric and gather-sum-apply iterations** through Flink's graph processing API, [Gelly]({{site.baseurl}}/libs/gelly_guide.html).
 
 The following table provides an overview of both operators:
 
@@ -113,7 +117,7 @@ setFinalState(state);
 
 <div class="panel panel-default">
 	<div class="panel-body">
-	See the <strong><a href="programming_guide.html">Programming Guide</a> </strong> for details and code examples.</div>
+	See the <strong><a href="index.html">Programming Guide</a> </strong> for details and code examples.</div>
 </div>
 
 ### Example: Incrementing Numbers
@@ -176,7 +180,7 @@ setFinalState(solution);
 
 <div class="panel panel-default">
 	<div class="panel-body">
-	See the <strong><a href="programming_guide.html">programming guide</a></strong> for details and code examples.</div>
+	See the <strong><a href="index.html">programming guide</a></strong> for details and code examples.</div>
 </div>
 
 ### Example: Propagate Minimum in Graph
diff --git a/docs/apis/python.md b/docs/apis/batch/python.md
similarity index 95%
rename from docs/apis/python.md
rename to docs/apis/batch/python.md
index d57e11765de21..74da97e933774 100644
--- a/docs/apis/python.md
+++ b/docs/apis/batch/python.md
@@ -1,6 +1,12 @@
 ---
 title: "Python Programming Guide"
 is_beta: true
+
+# Sub-level navigation
+sub-nav-group: batch
+sub-nav-id: python_api
+sub-nav-pos: 4
+sub-nav-title: Python API
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -21,8 +27,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
 Analysis programs in Flink are regular programs that implement transformations on data sets
 (e.g., filtering, mapping, joining, grouping). The data sets are initially created from certain
 sources (e.g., by reading files, or from collections). Results are returned via sinks, which may for
@@ -59,17 +63,17 @@ if __name__ == "__main__":
   env = get_environment()
   data = env.from_elements("Who's there?",
    "I think I hear them. Stand, ho! Who's there?")
-  
+
   data \
     .flat_map(lambda x, c: [(1, word) for word in x.lower().split()], (INT, STRING)) \
     .group_by(1) \
     .reduce_group(Adder(), (INT, STRING), combinable=True) \
     .output()
-  
+
   env.execute(local=True)
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Program Skeleton
 ----------------
@@ -84,7 +88,7 @@ programs with a `if __name__ == "__main__":` block. Each program consists of the
 5. Execute your program.
 
 We will now give an overview of each of those steps but please refer to the respective sections for
-more details. 
+more details.
 
 
 The `Environment` is the basis for all Flink programs. You can
@@ -116,7 +120,7 @@ a map transformation looks like this:
 data.map(lambda x: x*2, INT)
 {% endhighlight %}
 
-This will create a new DataSet by doubling every value in the original DataSet. 
+This will create a new DataSet by doubling every value in the original DataSet.
 For more information and a list of all the transformations,
 please refer to [Transformations](#transformations).
 
@@ -133,15 +137,15 @@ The last method is only useful for developing/debugging on a local machine,
 it will output the contents of the DataSet to standard output. (Note that in
 a cluster, the result goes to the standard out stream of the cluster nodes and ends
 up in the *.out* files of the workers).
-The first two do as the name suggests. 
+The first two do as the name suggests.
 Please refer to [Data Sinks](#data-sinks) for more information on writing to files.
 
 Once you specified the complete program you need to call `execute` on
-the `Environment`. This will either execute on your local machine or submit your program 
+the `Environment`. This will either execute on your local machine or submit your program
 for execution on a cluster, depending on how Flink was started. You can force
 a local execution by using `execute(local=True)`.
 
-[Back to top](#top)
+{% top %}
 
 Project setup
 ---------------
@@ -150,7 +154,7 @@ Apart from setting up Flink, no additional work is required. The python package
 
 The Python API was tested on Linux systems that have Python 2.7 or 3.4 installed.
 
-[Back to top](#top)
+{% top %}
 
 Lazy Evaluation
 ---------------
@@ -164,7 +168,7 @@ on the environment of the program.
 The lazy evaluation lets you construct sophisticated programs that Flink executes as one
 holistically planned unit.
 
-[Back to top](#top)
+{% top %}
 
 
 Transformations
@@ -265,10 +269,10 @@ data.reduce_group(Adder(), (INT, STRING))
       <td><strong>Join</strong></td>
       <td>
         Joins two data sets by creating all pairs of elements that are equal on their keys.
-        Optionally uses a JoinFunction to turn the pair of elements into a single element. 
+        Optionally uses a JoinFunction to turn the pair of elements into a single element.
         See <a href="#specifying-keys">keys</a> on how to define join keys.
 {% highlight python %}
-# In this case tuple fields are used as keys. 
+# In this case tuple fields are used as keys.
 # "0" is the join field on the first tuple
 # "1" is the join field on the second tuple.
 result = input1.join(input2).where(0).equal_to(1)
@@ -311,7 +315,7 @@ data.union(data2)
   </tbody>
 </table>
 
-[Back to Top](#top)
+{% top %}
 
 
 Specifying Keys
@@ -344,7 +348,7 @@ reduced = data \
   .reduce_group(<do something>)
 {% endhighlight %}
 
-The data set is grouped on the first field of the tuples. 
+The data set is grouped on the first field of the tuples.
 The group-reduce function will thus receive groups of tuples with
 the same value in the first field.
 
@@ -361,7 +365,7 @@ with the same value for both fields.
 A note on nested Tuples: If you have a DataSet with a nested tuple
 specifying `group_by(<index of tuple>)` will cause the system to use the full tuple as a key.
 
-[Back to top](#top)
+{% top %}
 
 
 Passing Functions to Flink
@@ -381,15 +385,15 @@ class Filter(FilterFunction):
 data.filter(Filter())
 {% endhighlight %}
 
-Rich functions allow the use of imported functions, provide access to broadcast-variables, 
+Rich functions allow the use of imported functions, provide access to broadcast-variables,
 can be parameterized using __init__(), and are the go-to-option for complex functions.
 They are also the only way to define an optional `combine` function for a reduce operation.
 
 Lambda functions allow the easy insertion of one-liners. Note that a lambda function has to return
 an iterable, if the operation can return multiple values. (All functions receiving a collector argument)
 
-Flink requires type information at the time when it prepares the program for execution 
-(when the main method of the program is called). This is done by passing an exemplary 
+Flink requires type information at the time when it prepares the program for execution
+(when the main method of the program is called). This is done by passing an exemplary
 object that has the desired type. This holds also for tuples.
 
 {% highlight python %}
@@ -400,7 +404,7 @@ Would denote a tuple containing an int and a string. Note that for Operations th
 
 There are a few Constants defined in flink.plan.Constants that allow this in a more readable fashion.
 
-[Back to top](#top)
+{% top %}
 
 Data Types
 ----------
@@ -409,7 +413,7 @@ Flink's Python API currently only supports primitive python types (int, float, b
 
 #### Tuples/Lists
 
-You can use the tuples (or lists) for composite types. Python tuples are mapped to the Flink Tuple type, that contain 
+You can use tuples (or lists) for composite types. Python tuples are mapped to the Flink Tuple type, which contains
 a fix number of fields of various types (up to 25). Every field of a tuple can be a primitive type - including further tuples, resulting in nested tuples.
 
 {% highlight python %}
@@ -428,7 +432,7 @@ wordCounts \
     .reduce(MyReduceFunction())
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Data Sources
 ------------
@@ -464,7 +468,7 @@ csvInput = env.read_csv("hdfs:///the/CSV/file", (INT, STRING, DOUBLE))
 values = env.from_elements("Foo", "bar", "foobar", "fubar")
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Data Sinks
 ----------
@@ -502,7 +506,7 @@ values.write_csv("file:///path/to/the/result/file", line_delimiter="\n", field_d
 values.write_text("file:///path/to/the/result/file")
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Broadcast Variables
 -------------------
@@ -522,11 +526,11 @@ class MapperBcv(MapFunction):
         return value * factor
 
 # 1. The DataSet to be broadcasted
-toBroadcast = env.from_elements(1, 2, 3) 
+toBroadcast = env.from_elements(1, 2, 3)
 data = env.from_elements("a", "b")
 
 # 2. Broadcast the DataSet
-data.map(MapperBcv(), INT).with_broadcast_set("bcv", toBroadcast) 
+data.map(MapperBcv(), INT).with_broadcast_set("bcv", toBroadcast)
 {% endhighlight %}
 
 Make sure that the names (`bcv` in the previous example) match when registering and
@@ -535,7 +539,7 @@ accessing broadcasted data sets.
 **Note**: As the content of broadcast variables is kept in-memory on each node, it should not become
 too large. For simpler things like scalar values you can simply parameterize the rich function.
 
-[Back to top](#top)
+{% top %}
 
 Parallel Execution
 ------------------
@@ -576,31 +580,31 @@ env.execute()
 
 A system-wide default parallelism for all execution environments can be defined by setting the
 `parallelism.default` property in `./conf/flink-conf.yaml`. See the
-[Configuration](config.html) documentation for details.
+[Configuration]({{ site.baseurl }}/setup/config.html) documentation for details.
 
-[Back to top](#top)
+{% top %}
 
 Executing Plans
 ---------------
 
-To run the plan with Flink, go to your Flink distribution, and run the pyflink.sh script from the /bin folder. 
-use pyflink2.sh for python 2.7, and pyflink3.sh for python 3.4. The script containing the plan has to be passed 
-as the first argument, followed by a number of additional python packages, and finally, separated by - additional 
-arguments that will be fed to the script. 
+To run the plan with Flink, go to your Flink distribution, and run the pyflink.sh script from the /bin folder.
+Use pyflink2.sh for Python 2.7, and pyflink3.sh for Python 3.4. The script containing the plan has to be passed
+as the first argument, followed by a number of additional Python packages, and finally, separated by `-`, additional
+arguments that will be fed to the script.
 
 {% highlight python %}
 ./bin/pyflink<2/3>.sh <Script>[ <pathToPackage1>[ <pathToPackageX]][ - <param1>[ <paramX>]]
 {% endhighlight %}
 
-[Back to top](#top)
+{% top %}
 
 Debugging
 ---------------
 
 If you are running Flink programs locally, you can debug your program following this guide.
 First you have to enable debugging by setting the debug switch in the `env.execute(debug=True)` call. After
-submitting your program, open the jobmanager log file, and look for a line that says 
+submitting your program, open the jobmanager log file, and look for a line that says
 `Waiting for external Process : <taskname>. Run python /tmp/flink/executor.py <port>` Now open `/tmp/flink` in your python
 IDE and run the `executor.py <port>`.
 
-[Back to top](#top)
+{% top %}
diff --git a/docs/apis/best_practices.md b/docs/apis/best_practices.md
index 9a082220bcdba..9444fb64adcf2 100644
--- a/docs/apis/best_practices.md
+++ b/docs/apis/best_practices.md
@@ -1,5 +1,8 @@
 ---
 title: "Best Practices"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 4
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,9 +23,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
-
 This page contains a collection of best practices for Flink programmers on how to solve frequently encountered problems.
 
 
@@ -397,4 +397,3 @@ Next, you need to put the following jar files into the `lib/` folder:
  * `logback-classic.jar`
  * `logback-core.jar`
  * `log4j-over-slf4j.jar`: This bridge needs to be present in the classpath for redirecting logging calls from Hadoop (which is using Log4j) to Slf4j.
-
diff --git a/docs/apis/cli.md b/docs/apis/cli.md
index 78ea4b6f5c0ec..b67bc0d66f239 100644
--- a/docs/apis/cli.md
+++ b/docs/apis/cli.md
@@ -1,5 +1,8 @@
 ---
 title:  "Command-Line Interface"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 5
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/apis/cluster_execution.md b/docs/apis/cluster_execution.md
index 54e1b41e4d8f8..05ee68870ebf4 100644
--- a/docs/apis/cluster_execution.md
+++ b/docs/apis/cluster_execution.md
@@ -1,5 +1,8 @@
 ---
 title:  "Cluster Execution"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 8
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -98,9 +101,9 @@ The latter version is recommended as it respects the classloader management in F
 
 To provide these dependencies not included by Flink we suggest two options with Maven.
 
-1. The maven assembly plugin builds a so-called uber-jar(executable jar) 
+1. The Maven assembly plugin builds a so-called uber-jar (executable jar)
 containing all your dependencies.
-Assembly configuration is straight-forward, but the resulting jar might become bulky. See 
+Assembly configuration is straightforward, but the resulting jar might become bulky. See
 [usage](http://maven.apache.org/plugins/maven-assembly-plugin/usage.html).
 2. The maven unpack plugin, for unpacking the relevant parts of the dependencies and
 then package it with your code.
diff --git a/docs/apis/example_connectors.md b/docs/apis/connectors.md
similarity index 98%
rename from docs/apis/example_connectors.md
rename to docs/apis/connectors.md
index 1a66529ca7015..ebacb335794a1 100644
--- a/docs/apis/example_connectors.md
+++ b/docs/apis/connectors.md
@@ -1,5 +1,8 @@
 ---
-title:  "Connecting to other systems (Batch)"
+title:  "Connectors"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 3
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,7 +23,10 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## Reading from file systems.
+* TOC
+{:toc}
+
+## Reading from file systems
 
 Flink has build-in support for the following file systems:
 
diff --git a/docs/apis/filesystems.md b/docs/apis/filesystems.md
new file mode 100644
index 0000000000000..e100cddbba1ec
--- /dev/null
+++ b/docs/apis/filesystems.md
@@ -0,0 +1,236 @@
+---
+title: "File Systems"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 9
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Reading from file systems
+
+Flink has built-in support for the following file systems:
+
+| Filesystem                            | Scheme       | Notes  |
+| ------------------------------------- |--------------| ------ |
+| Hadoop Distributed File System (HDFS) &nbsp; | `hdfs://`    | All HDFS versions are supported |
+| Amazon S3                             | `s3://`      | Support through Hadoop file system implementation (see below) |
+| MapR file system                      | `maprfs://`  | The user has to manually place the required jar files in the `lib/` dir |
+| Tachyon                               | `tachyon://` &nbsp; | Support through Hadoop file system implementation (see below) |
+
+
+
+### Using Hadoop file system implementations
+
+Apache Flink allows users to use any file system implementing the `org.apache.hadoop.fs.FileSystem`
+interface. There are Hadoop `FileSystem` implementations for
+
+- [S3](https://aws.amazon.com/s3/) (tested)
+- [Google Cloud Storage Connector for Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector) (tested)
+- [Tachyon](http://tachyon-project.org/) (tested)
+- [XtreemFS](http://www.xtreemfs.org/) (tested)
+- FTP via [Hftp](http://hadoop.apache.org/docs/r1.2.1/hftp.html) (not tested)
+- and many more.
+
+In order to use a Hadoop file system with Flink, make sure that
+
+- the `flink-conf.yaml` has the `fs.hdfs.hadoopconf` property set to the Hadoop configuration directory.
+- the Hadoop configuration (in that directory) has an entry for the required file system. Examples for S3 and Tachyon are shown below.
+- the required classes for using the file system are available in the `lib/` folder of the Flink installation (on all machines running Flink). If putting the files into the directory is not possible, Flink also respects the `HADOOP_CLASSPATH` environment variable, which can be used to add Hadoop jar files to the classpath.
+
+#### Amazon S3
+
+For Amazon S3 support, add the following entries to the `core-site.xml` file:
+
+~~~xml
+<!-- configure the file system implementation -->
+<property>
+  <name>fs.s3.impl</name>
+  <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
+</property>
+
+<!-- set your AWS access key ID -->
+<property>
+  <name>fs.s3.awsAccessKeyId</name>
+  <value>putKeyHere</value>
+</property>
+
+<!-- set your AWS secret access key -->
+<property>
+  <name>fs.s3.awsSecretAccessKey</name>
+  <value>putSecretHere</value>
+</property>
+~~~
+
+#### Tachyon
+
+For Tachyon support, add the following entry to the `core-site.xml` file:
+
+~~~xml
+<property>
+  <name>fs.tachyon.impl</name>
+  <value>tachyon.hadoop.TFS</value>
+</property>
+~~~
+
+
+## Connecting to other systems using Input/OutputFormat wrappers for Hadoop
+
+Apache Flink allows users to access many different systems as data sources or sinks.
+The system is designed to be easily extensible. Similar to Apache Hadoop, Flink has the concept
+of so-called `InputFormat`s and `OutputFormat`s.
+
+One implementation of these `InputFormat`s is the `HadoopInputFormat`. This is a wrapper that allows
+users to use all existing Hadoop input formats with Flink.
+
+This section shows some examples for connecting Flink to other systems.
+[Read more about Hadoop compatibility in Flink]({{ site.baseurl }}/apis/batch/hadoop_compatibility.html).
+
+## Avro support in Flink
+
+Flink has extensive built-in support for [Apache Avro](http://avro.apache.org/). This makes it easy to read from Avro files with Flink.
+Also, the serialization framework of Flink is able to handle classes generated from Avro schemas.
+
+In order to read data from an Avro file, you have to specify an `AvroInputFormat`.
+
+**Example**:
+
+~~~java
+AvroInputFormat<User> users = new AvroInputFormat<User>(in, User.class);
+DataSet<User> usersDS = env.createInput(users);
+~~~
+
+Note that `User` is a POJO generated by Avro. Flink also supports string-based key selection on these POJOs. For example:
+
+~~~java
+usersDS.groupBy("name")
+~~~
+
+
+Note that using the `GenericData.Record` type is possible with Flink, but not recommended. Since the record contains the full schema, it is very data intensive and thus probably slow to use.
+
+Flink's POJO field selection also works with POJOs generated from Avro. However, this only works if the field types are written correctly to the generated class. If a field is of type `Object`, you cannot use the field as a join or grouping key.
+Specifying a field in Avro like this `{"name": "type_double_test", "type": "double"},` works fine. However, specifying it as a UNION type with only one member (`{"name": "type_double_test", "type": ["double"]},`) will generate a field of type `Object`. Note that specifying nullable types (`{"name": "type_double_test", "type": ["null", "double"]},`) is possible!
+
+
+
+### Access Microsoft Azure Table Storage
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example uses the `HadoopInputFormat` wrapper to access [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/) through an existing Hadoop input format implementation.
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, so we have to build the project ourselves.
+Execute the following commands:
+
+   ~~~bash
+   git clone https://github.com/mooso/azure-tables-hadoop.git
+   cd azure-tables-hadoop
+   mvn clean install
+   ~~~
+
+2. Set up a new Flink project using the quickstarts:
+
+   ~~~bash
+   curl https://flink.apache.org/q/quickstart.sh | bash
+   ~~~
+
+3. Add the following dependencies (in the `<dependencies>` section) to your `pom.xml` file:
+
+   ~~~xml
+   <dependency>
+       <groupId>org.apache.flink</groupId>
+       <artifactId>flink-hadoop-compatibility</artifactId>
+       <version>{{site.version}}</version>
+   </dependency>
+   <dependency>
+     <groupId>com.microsoft.hadoop</groupId>
+     <artifactId>microsoft-hadoop-azure</artifactId>
+     <version>0.0.4</version>
+   </dependency>
+   ~~~
+
+   `flink-hadoop-compatibility` is a Flink package that provides the Hadoop input format wrappers.
+   `microsoft-hadoop-azure` adds the project we built in step 1 to our project.
+
+The project is now ready for coding. We recommend importing the project into an IDE, such as Eclipse or IntelliJ. (Import it as a Maven project!)
+Browse to the code of the `Job.java` file. It is an empty skeleton for a Flink job.
+
+Paste the following code into it:
+
+~~~java
+import java.util.Map;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import com.microsoft.hadoop.azure.AzureTableConfiguration;
+import com.microsoft.hadoop.azure.AzureTableInputFormat;
+import com.microsoft.hadoop.azure.WritableEntity;
+import com.microsoft.windowsazure.storage.table.EntityProperty;
+
+public class AzureTableExample {
+
+  public static void main(String[] args) throws Exception {
+    // set up the execution environment
+    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+    // create an AzureTableInputFormat, using a Hadoop input format wrapper
+    HadoopInputFormat<Text, WritableEntity> hdIf = new HadoopInputFormat<Text, WritableEntity>(new AzureTableInputFormat(), Text.class, WritableEntity.class, new Job());
+
+    // set the Account URI, something like: https://apacheflink.table.core.windows.net
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.ACCOUNT_URI.getKey(), "TODO");
+    // set the secret storage key here
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.STORAGE_KEY.getKey(), "TODO");
+    // set the table name here
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.TABLE_NAME.getKey(), "TODO");
+
+    DataSet<Tuple2<Text, WritableEntity>> input = env.createInput(hdIf);
+    // a small example of how to use the data in a mapper
+    DataSet<String> fin = input.map(new MapFunction<Tuple2<Text,WritableEntity>, String>() {
+      @Override
+      public String map(Tuple2<Text, WritableEntity> arg0) throws Exception {
+        System.err.println("--------------------------------\nKey = "+arg0.f0);
+        WritableEntity we = arg0.f1;
+
+        for(Map.Entry<String, EntityProperty> prop : we.getProperties().entrySet()) {
+          System.err.println("key="+prop.getKey() + " ; value (asString)="+prop.getValue().getValueAsString());
+        }
+
+        return arg0.f0.toString();
+      }
+    });
+
+    // emit result (this works only locally)
+    fin.print();
+
+    // execute program
+    env.execute("Azure Example");
+  }
+}
+~~~
+
+The example shows how to access an Azure table and turn its data into a Flink `DataSet` (more specifically, the type of the set is `DataSet<Tuple2<Text, WritableEntity>>`). You can then apply all known transformations to the `DataSet`.
+
+## Access MongoDB
+
+This [GitHub repository documents how to use MongoDB with Apache Flink (starting from 0.7-incubating)](https://github.com/okkam-it/flink-mongodb-test).
diff --git a/docs/apis/index.md b/docs/apis/index.md
index db82e6fbba548..ab12b79b99a95 100644
--- a/docs/apis/index.md
+++ b/docs/apis/index.md
@@ -18,4 +18,4 @@ software distributed under the License is distributed on an
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
--->
\ No newline at end of file
+-->
diff --git a/docs/apis/java8.md b/docs/apis/java8.md
index 6866b951c6575..53269e3418778 100644
--- a/docs/apis/java8.md
+++ b/docs/apis/java8.md
@@ -1,5 +1,9 @@
 ---
 title: "Java 8 Programming Guide"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 11
+top-nav-title: Java 8
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,8 +24,8 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Java 8 introduces several new language features designed for faster and clearer coding. With the most important feature, 
-the so-called "Lambda Expressions", Java 8 opens the door to functional programming. Lambda Expressions allow for implementing and 
+Java 8 introduces several new language features designed for faster and clearer coding. With the most important feature,
+the so-called "Lambda Expressions", Java 8 opens the door to functional programming. Lambda Expressions allow for implementing and
 passing functions in a straightforward way without having to declare additional (anonymous) classes.
 
 The newest version of Flink supports the usage of Lambda Expressions for all operators of the Java API.
@@ -33,7 +37,7 @@ Flink API, please refer to the [Programming Guide](programming_guide.html)
 
 ### Examples
 
-The following example illustrates how to implement a simple, inline `map()` function that squares its input using a Lambda Expression. 
+The following example illustrates how to implement a simple, inline `map()` function that squares its input using a Lambda Expression.
 The types of input `i` and output parameters of the `map()` function need not to be declared as they are inferred by the Java 8 compiler.
 
 ~~~java
@@ -43,9 +47,9 @@ env.fromElements(1, 2, 3)
 .print();
 ~~~
 
-The next two examples show different implementations of a function that uses a `Collector` for output. 
-Functions, such as `flatMap()`, require a output type (in this case `String`) to be defined for the `Collector` in order to be type-safe. 
-If the `Collector` type can not be inferred from the surrounding context, it need to be declared in the Lambda Expression's parameter list manually. 
+The next two examples show different implementations of a function that uses a `Collector` for output.
+Functions, such as `flatMap()`, require an output type (in this case `String`) to be defined for the `Collector` in order to be type-safe.
+If the `Collector` type cannot be inferred from the surrounding context, it needs to be declared in the Lambda Expression's parameter list manually.
 Otherwise the output will be treated as type `Object` which can lead to undesired behaviour.
 
 ~~~java
@@ -65,7 +69,7 @@ input.flatMap((Integer number, Collector<String> out) -> {
 DataSet<String> input = env.fromElements(1, 2, 3);
 
 // collector type must not be declared, it is inferred from the type of the dataset
-DataSet<String> manyALetters = input.flatMap((number, out) -> {	
+DataSet<String> manyALetters = input.flatMap((number, out) -> {
     for(int i = 0; i < number; i++) {
         out.collect("a");
     }
@@ -79,13 +83,13 @@ The following code demonstrates a word count which makes extensive use of Lambda
 
 ~~~java
 DataSet<String> input = env.fromElements("Please count", "the words", "but not this");
-		
+
 // filter out strings that contain "not"
 input.filter(line -> !line.contains("not"))
 // split each line by space
 .map(line -> line.split(" "))
 // emit a pair <word,1> for each array element
-.flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out) 
+.flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out)
     -> Arrays.stream(wordArray).forEach(t -> out.collect(new Tuple2<>(t, 1)))
     )
 // group and sum up
@@ -95,12 +99,12 @@ input.filter(line -> !line.contains("not"))
 ~~~
 
 ### Compiler Limitations
-Currently, Flink only supports jobs containing Lambda Expressions completely if they are **compiled with the Eclipse JDT compiler contained in Eclipse Luna 4.4.2 (and above)**. 
+Currently, Flink only supports jobs containing Lambda Expressions completely if they are **compiled with the Eclipse JDT compiler contained in Eclipse Luna 4.4.2 (and above)**.
 
-Only the Eclipse JDT compiler preserves the generic type information necessary to use the entire Lambda Expressions feature type-safely. 
-Other compilers such as the OpenJDK's and Oracle JDK's `javac` throw away all generic parameters related to Lambda Expressions. This means that types such as `Tuple2<String,Integer` or `Collector<String>` declared as a Lambda function input or output parameter will be pruned to `Tuple2` or `Collector` in the compiled `.class` files, which is too little information for the Flink Compiler. 
+Only the Eclipse JDT compiler preserves the generic type information necessary to use the entire Lambda Expressions feature type-safely.
+Other compilers such as OpenJDK's and Oracle JDK's `javac` throw away all generic parameters related to Lambda Expressions. This means that types such as `Tuple2<String, Integer>` or `Collector<String>` declared as a Lambda function input or output parameter will be pruned to `Tuple2` or `Collector` in the compiled `.class` files, which is too little information for the Flink Compiler.
 
-How to compile a Flink job that contains Lambda Expressions with the JDT compiler will be covered in the next section. 
+How to compile a Flink job that contains Lambda Expressions with the JDT compiler will be covered in the next section.
 
 However, it is possible to implement functions such as `map()` or `filter()` with Lambda Expressions in Java 8 compilers other than the Eclipse JDT compiler as long as the function has no `Collector`s or `Iterable`s *and* only if the function handles unparameterized types such as `Integer`, `Long`, `String`, `MyOwnClass` (types without Generics!).
 
@@ -108,7 +112,7 @@ However, it is possible to implement functions such as `map()` or `filter()` wit
 
 If you are using the Eclipse IDE, you can run and debug your Flink code within the IDE without any problems after some configuration steps. The Eclipse IDE by default compiles its Java sources with the Eclipse JDT compiler. The next section describes how to configure the Eclipse IDE.
 
-If you are using a different IDE such as IntelliJ IDEA or you want to package your Jar-File with Maven to run your job on a cluster, you need to modify your project's `pom.xml` file and build your program with Maven. The [quickstart]({{site.baseurl}}/quickstart/setup_quickstart.html) contains preconfigured Maven projects which can be used for new projects or as a reference. Uncomment the mentioned lines in your generated quickstart `pom.xml` file if you want to use Java 8 with Lambda Expressions. 
+If you are using a different IDE such as IntelliJ IDEA or you want to package your Jar-File with Maven to run your job on a cluster, you need to modify your project's `pom.xml` file and build your program with Maven. The [quickstart]({{site.baseurl}}/quickstart/setup_quickstart.html) contains preconfigured Maven projects which can be used for new projects or as a reference. Uncomment the mentioned lines in your generated quickstart `pom.xml` file if you want to use Java 8 with Lambda Expressions.
 
 Alternatively, you can manually insert the following lines to your Maven `pom.xml` file. Maven will then use the Eclipse JDT compiler for compilation.
 
@@ -146,7 +150,7 @@ If you are using Eclipse for development, the m2e plugin might complain about th
         <versionRange>[3.1,)</versionRange>
         <goals>
             <goal>testCompile</goal>
-            <goal>compile</goal> 
+            <goal>compile</goal>
         </goals>
     </pluginExecutionFilter>
     <action>
@@ -159,7 +163,7 @@ If you are using Eclipse for development, the m2e plugin might complain about th
 
 First of all, make sure you are running a current version of Eclipse IDE (4.4.2 or later). Also make sure that you have a Java 8 Runtime Environment (JRE) installed in Eclipse IDE (`Window` -> `Preferences` -> `Java` -> `Installed JREs`).
 
-Create/Import your Eclipse project. 
+Create/Import your Eclipse project.
 
 If you are using Maven, you also need to change the Java version in your `pom.xml` for the `maven-compiler-plugin`. Otherwise right click the `JRE System Library` section of your project and open the `Properties` window in order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.
 
@@ -177,7 +181,7 @@ org.eclipse.jdt.core.compiler.compliance=1.8
 org.eclipse.jdt.core.compiler.source=1.8
 ~~~
 
-After you have saved the file, perform a complete project refresh in Eclipse IDE. 
+After you have saved the file, perform a complete project refresh in Eclipse IDE.
 
 If you are using Maven, right click your Eclipse project and select `Maven` -> `Update Project...`.
 
diff --git a/docs/apis/local_execution.md b/docs/apis/local_execution.md
index dacd114e20eb2..93d8860e018c0 100644
--- a/docs/apis/local_execution.md
+++ b/docs/apis/local_execution.md
@@ -1,5 +1,8 @@
 ---
 title:  "Local Execution"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 7
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -99,7 +102,7 @@ Users can use algorithms implemented for batch processing also for cases that ar
 public static void main(String[] args) throws Exception {
     // initialize a new Collection-based execution environment
     final ExecutionEnvironment env = new CollectionEnvironment();
-    
+
     DataSet<User> users = env.fromCollection( /* get elements from a Java Collection */);
 
     /* Data Set transformations ... */
@@ -107,10 +110,10 @@ public static void main(String[] args) throws Exception {
     // retrieve the resulting Tuple2 elements into a ArrayList.
     Collection<...> result = new ArrayList<...>();
     resultDataSet.output(new LocalCollectionOutputFormat<...>(result));
-    
+
     // kick off execution.
     env.execute();
-    
+
     // Do some work with the resulting ArrayList (=Collection).
     for(... t : result) {
         System.err.println("Result = "+t);
diff --git a/docs/apis/scala_shell.md b/docs/apis/scala_shell.md
index a0c10c1eb9b4d..2296dca1c7645 100644
--- a/docs/apis/scala_shell.md
+++ b/docs/apis/scala_shell.md
@@ -1,5 +1,8 @@
 ---
-title: "Interactive Scala Shell"
+title: "Scala Shell"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 10
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -25,7 +28,7 @@ Flink comes with an integrated interactive Scala Shell.
 It can be used in a local setup as well as in a cluster setup. To get started with downloading
 Flink and setting up a cluster please refer to
 [local setup]({{ site.baseurl }}/setup/local_setup.html) or
-[cluster setup]({{ site.baseurl }}/setup/cluster.html) 
+[cluster setup]({{ site.baseurl }}/setup/cluster.html).
 
 To use the shell with an integrated Flink cluster just execute:
 
@@ -78,4 +81,3 @@ Use the parameter `-a <path/to/jar.jar>` or `--addclasspath <path/to/jar.jar>` t
 ~~~bash
 bin/start-scala-shell.sh [local | remote <host> <port>] --addclasspath <path/to/jar.jar>
 ~~~
-
diff --git a/docs/apis/streaming/connectors/docker.md b/docs/apis/streaming/connectors/docker.md
new file mode 100644
index 0000000000000..d360ef6a93efe
--- /dev/null
+++ b/docs/apis/streaming/connectors/docker.md
@@ -0,0 +1,116 @@
+---
+title: "Docker Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 6
+sub-nav-title: Docker
+---
+
+A Docker container is provided with all the required configuration for test-running the connectors of Apache Flink. The servers for the message queues run inside the Docker container, while the example topology runs on the user's computer.
+
+#### Installing Docker
+The official Docker installation guide can be found [here](https://docs.docker.com/installation/).
+After installing Docker an image can be pulled for each connector. Containers can be started from these images where all the required configurations are set.
+
+#### Creating a jar with all the dependencies
+For the easiest setup, create a jar with all the dependencies of the *flink-streaming-connectors* project.
+
+~~~bash
+cd /PATH/TO/GIT/flink/flink-staging/flink-streaming-connectors
+mvn assembly:assembly
+~~~
+
+This creates an assembly jar under *flink-streaming-connectors/target*.
+
+#### RabbitMQ
+Pull the docker image:
+
+~~~bash
+sudo docker pull flinkstreaming/flink-connectors-rabbitmq
+~~~
+
+To run the container, type:
+
+~~~bash
+sudo docker run -p 127.0.0.1:5672:5672 -t -i flinkstreaming/flink-connectors-rabbitmq
+~~~
+
+A terminal is now running inside the container, which has all the configuration needed to test-run the RabbitMQ connector. The `-p` flag maps the container's port to the same port on localhost so that RabbitMQ can communicate with the application through it.
+
+To start the RabbitMQ server:
+
+~~~bash
+sudo /etc/init.d/rabbitmq-server start
+~~~
+
+To launch the example on the host computer, execute:
+
+~~~bash
+java -cp /PATH/TO/JAR-WITH-DEPENDENCIES org.apache.flink.streaming.connectors.rabbitmq.RMQTopology \
+> log.txt 2> errorlog.txt
+~~~
+
+There are two connectors in the example: one that sends messages to RabbitMQ and one that receives messages from the same queue. In the log output, the arriving messages can be observed in the following format:
+
+~~~
+<DATE> INFO rabbitmq.RMQTopology: String: <one> arrived from RMQ
+<DATE> INFO rabbitmq.RMQTopology: String: <two> arrived from RMQ
+<DATE> INFO rabbitmq.RMQTopology: String: <three> arrived from RMQ
+<DATE> INFO rabbitmq.RMQTopology: String: <four> arrived from RMQ
+<DATE> INFO rabbitmq.RMQTopology: String: <five> arrived from RMQ
+~~~
+
+#### Apache Kafka
+
+Pull the image:
+
+~~~bash
+sudo docker pull flinkstreaming/flink-connectors-kafka
+~~~
+
+To run the container, type:
+
+~~~bash
+sudo docker run -p 127.0.0.1:2181:2181 -p 127.0.0.1:9092:9092 -t -i \
+flinkstreaming/flink-connectors-kafka
+~~~
+
+A terminal is now running inside the container, which has all the configuration needed to test-run the Kafka connector. The `-p` flags map the container's ports to the same ports on localhost so that Kafka can communicate with the application through them.
+First, start a ZooKeeper server in the background:
+
+~~~bash
+/kafka_2.9.2-0.8.1.1/bin/zookeeper-server-start.sh /kafka_2.9.2-0.8.1.1/config/zookeeper.properties \
+> zookeeperlog.txt &
+~~~
+
+Then start the Kafka server in the background:
+
+~~~bash
+/kafka_2.9.2-0.8.1.1/bin/kafka-server-start.sh /kafka_2.9.2-0.8.1.1/config/server.properties \
+ > serverlog.txt 2> servererr.txt &
+~~~
+
+To launch the example on the host computer, execute:
+
+~~~bash
+java -cp /PATH/TO/JAR-WITH-DEPENDENCIES org.apache.flink.streaming.connectors.kafka.KafkaTopology \
+> log.txt 2> errorlog.txt
+~~~
+
+
+There are two connectors in the example: one that sends messages to Kafka and one that receives messages from the same topic. In the log output, the arriving messages can be observed in the following format:
+
+~~~
+<DATE> INFO kafka.KafkaTopology: String: (0) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (1) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (2) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (3) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (4) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (5) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (6) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (7) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (8) arrived from Kafka
+<DATE> INFO kafka.KafkaTopology: String: (9) arrived from Kafka
+~~~
diff --git a/docs/apis/streaming/connectors/elasticsearch.md b/docs/apis/streaming/connectors/elasticsearch.md
new file mode 100644
index 0000000000000..5c07838fdb8db
--- /dev/null
+++ b/docs/apis/streaming/connectors/elasticsearch.md
@@ -0,0 +1,165 @@
+---
+title: "Elasticsearch Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 2
+sub-nav-title: Elasticsearch
+---
+
+This connector provides a Sink that can write to an
+[Elasticsearch](https://elastic.co/) Index. To use this connector, add the
+following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-elasticsearch</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary
+distribution. See
+[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution)
+for information about how to package the program with the libraries for
+cluster execution.
+
+#### Installing Elasticsearch
+
+Instructions for setting up an Elasticsearch cluster can be found
+[here](https://www.elastic.co/guide/en/elasticsearch/reference/current/setup.html).
+Make sure to set and remember a cluster name. This must be set when
+creating a Sink for writing to your cluster.
+
+#### Elasticsearch Sink
+The connector provides a Sink that can send data to an Elasticsearch Index.
+
+The sink can use two different methods for communicating with Elasticsearch:
+
+1. An embedded Node
+2. The TransportClient
+
+See [here](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/client.html)
+for information about the differences between the two modes.
+
+This code shows how to create a sink that uses an embedded Node for
+communication:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<String> input = ...;
+
+Map<String, String> config = Maps.newHashMap();
+// This instructs the sink to emit after every element, otherwise they would be buffered
+config.put("bulk.flush.max.actions", "1");
+config.put("cluster.name", "my-cluster-name");
+
+input.addSink(new ElasticsearchSink<>(config, new IndexRequestBuilder<String>() {
+    @Override
+    public IndexRequest createIndexRequest(String element, RuntimeContext ctx) {
+        Map<String, Object> json = new HashMap<>();
+        json.put("data", element);
+
+        return Requests.indexRequest()
+                .index("my-index")
+                .type("my-type")
+                .source(json);
+    }
+}));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[String] = ...
+
+val config = new util.HashMap[String, String]
+config.put("bulk.flush.max.actions", "1")
+config.put("cluster.name", "my-cluster-name")
+
+input.addSink(new ElasticsearchSink(config, new IndexRequestBuilder[String] {
+  override def createIndexRequest(element: String, ctx: RuntimeContext): IndexRequest = {
+    val json = new util.HashMap[String, AnyRef]
+    json.put("data", element)
+    println("SENDING: " + element)
+    Requests.indexRequest.index("my-index").`type`("my-type").source(json)
+  }
+}))
+{% endhighlight %}
+</div>
+</div>
+
+Note how a Map of Strings is used to configure the Sink. The configuration keys
+are documented in the Elasticsearch documentation
+[here](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html).
+Especially important is the `cluster.name` parameter that must correspond to
+the name of your cluster.
+
+Internally, the sink uses a `BulkProcessor` to send index requests to the cluster.
+This will buffer elements before sending a request to the cluster. The behaviour of the
+`BulkProcessor` can be configured using these config keys:
+ * **bulk.flush.max.actions**: Maximum number of elements to buffer
+ * **bulk.flush.max.size.mb**: Maximum amount of data (in megabytes) to buffer
+ * **bulk.flush.interval.ms**: Interval (in milliseconds) at which to flush data regardless
+  of the other two settings
+
+This example code does the same, but with a `TransportClient`:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<String> input = ...;
+
+Map<String, String> config = Maps.newHashMap();
+// This instructs the sink to emit after every element, otherwise they would be buffered
+config.put("bulk.flush.max.actions", "1");
+config.put("cluster.name", "my-cluster-name");
+
+List<TransportAddress> transports = new ArrayList<>();
+transports.add(new InetSocketTransportAddress("node-1", 9300));
+transports.add(new InetSocketTransportAddress("node-2", 9300));
+
+input.addSink(new ElasticsearchSink<>(config, transports, new IndexRequestBuilder<String>() {
+    @Override
+    public IndexRequest createIndexRequest(String element, RuntimeContext ctx) {
+        Map<String, Object> json = new HashMap<>();
+        json.put("data", element);
+
+        return Requests.indexRequest()
+                .index("my-index")
+                .type("my-type")
+                .source(json);
+    }
+}));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[String] = ...
+
+val config = new util.HashMap[String, String]
+config.put("bulk.flush.max.actions", "1")
+config.put("cluster.name", "my-cluster-name")
+
+val transports = new util.ArrayList[TransportAddress]
+transports.add(new InetSocketTransportAddress("node-1", 9300))
+transports.add(new InetSocketTransportAddress("node-2", 9300))
+
+input.addSink(new ElasticsearchSink(config, transports, new IndexRequestBuilder[String] {
+  override def createIndexRequest(element: String, ctx: RuntimeContext): IndexRequest = {
+    val json = new util.HashMap[String, AnyRef]
+    json.put("data", element)
+    println("SENDING: " + element)
+    Requests.indexRequest.index("my-index").`type`("my-type").source(json)
+  }
+}))
+{% endhighlight %}
+</div>
+</div>
+
+The difference is that we now need to provide a list of Elasticsearch Nodes
+to which the sink should connect using a `TransportClient`.
+
+More information about Elasticsearch can be found [here](https://elastic.co).
diff --git a/docs/apis/streaming/connectors/hdfs.md b/docs/apis/streaming/connectors/hdfs.md
new file mode 100644
index 0000000000000..a9df3541768e9
--- /dev/null
+++ b/docs/apis/streaming/connectors/hdfs.md
@@ -0,0 +1,115 @@
+---
+title: "HDFS Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 3
+sub-nav-title: HDFS
+---
+
+This connector provides a Sink that writes rolling files to any filesystem supported by
+Hadoop FileSystem. To use this connector, add the
+following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-filesystem</artifactId>
+  <version>{{site.version}}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary
+distribution. See
+[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution)
+for information about how to package the program with the libraries for
+cluster execution.
+
+#### Rolling File Sink
+
+The rolling behaviour as well as the writing can be configured but we will get to that later.
+This is how you can create a default rolling sink:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<String> input = ...;
+
+input.addSink(new RollingSink<String>("/base/path"));
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[String] = ...
+
+input.addSink(new RollingSink("/base/path"))
+
+{% endhighlight %}
+</div>
+</div>
+
+The only required parameter is the base path where the rolling files (buckets) will be
+stored. The sink can be configured by specifying a custom bucketer, writer and batch size.
+
+By default the rolling sink will use the pattern `"yyyy-MM-dd--HH"` to name the rolling buckets.
+This pattern is passed to `SimpleDateFormat` with the current system time to form a bucket path. A
+new bucket will be created whenever the bucket path changes. For example, if you have a pattern
+that contains minutes as the finest granularity you will get a new bucket every minute.
+Each bucket is itself a directory that contains several part files: Each parallel instance
+of the sink will create its own part file and when part files get too big the sink will also
+create a new part file next to the others. To specify a custom bucketer use `setBucketer()`
+on a `RollingSink`.
+
+The default writer is `StringWriter`. This will call `toString()` on the incoming elements
+and write them to part files, separated by newline. To specify a custom writer use `setWriter()`
+on a `RollingSink`. If you want to write Hadoop SequenceFiles you can use the provided
+`SequenceFileWriter` which can also be configured to use compression.
+
+The last configuration option is the batch size. This specifies when a part file should be closed
+and a new one started. The default part file size is 384 MB.
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<IntWritable,Text>> input = ...;
+
+RollingSink<Tuple2<IntWritable, Text>> sink = new RollingSink<>("/base/path");
+sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd--HHmm"));
+sink.setWriter(new SequenceFileWriter<IntWritable, Text>());
+sink.setBatchSize(1024 * 1024 * 400); // this is 400 MB
+
+input.addSink(sink);
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[Tuple2[IntWritable, Text]] = ...
+
+val sink = new RollingSink[Tuple2[IntWritable, Text]]("/base/path")
+sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd--HHmm"))
+sink.setWriter(new SequenceFileWriter[IntWritable, Text]())
+sink.setBatchSize(1024 * 1024 * 400) // this is 400 MB
+
+input.addSink(sink)
+
+{% endhighlight %}
+</div>
+</div>
+
+This will create a sink that writes to bucket files that follow this schema:
+
+```
+/base/path/{date-time}/part-{parallel-task}-{count}
+```
+
+Where `date-time` is the string that we get from the date/time format, `parallel-task` is the index
+of the parallel sink instance and `count` is the running number of part files that were created
+because of the batch size.
+
+For in-depth information, please refer to the JavaDoc for
+[RollingSink](http://flink.apache.org/docs/latest/api/java/org/apache/flink/streaming/connectors/fs/RollingSink.html).
diff --git a/docs/apis/streaming/connectors/index.md b/docs/apis/streaming/connectors/index.md
new file mode 100644
index 0000000000000..d2c04dd8223f9
--- /dev/null
+++ b/docs/apis/streaming/connectors/index.md
@@ -0,0 +1,26 @@
+---
+title: "Streaming Connectors"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-id: connectors
+sub-nav-pos: 2
+sub-nav-title: Connectors
+---
+
+Connectors provide code for interfacing with various third-party systems.
+
+Currently these systems are supported:
+
+ * [Apache Kafka](https://kafka.apache.org/) (sink/source)
+ * [Elasticsearch](https://elastic.co/) (sink)
+ * [Hadoop FileSystem](http://hadoop.apache.org) (sink)
+ * [RabbitMQ](http://www.rabbitmq.com/) (sink/source)
+ * [Twitter Streaming API](https://dev.twitter.com/docs/streaming-apis) (source)
+
+To run an application using one of these connectors, additional third-party
+components usually need to be installed and launched, e.g. the servers
+for the message queues. Further instructions for these can be found in the
+corresponding subsections. [Docker containers](docker.html)
+are also provided that encapsulate these services to help users get started
+with connectors.
\ No newline at end of file
diff --git a/docs/apis/streaming/connectors/kafka.md b/docs/apis/streaming/connectors/kafka.md
new file mode 100644
index 0000000000000..15fe79ec45c98
--- /dev/null
+++ b/docs/apis/streaming/connectors/kafka.md
@@ -0,0 +1,160 @@
+---
+title: "Apache Kafka Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 1
+sub-nav-title: Kafka
+---
+
+This connector provides access to event streams served by [Apache Kafka](https://kafka.apache.org/).
+
+Flink provides special Kafka Connectors for reading and writing data from/to Kafka topics.
+The Flink Kafka Consumer integrates with Flink's checkpointing mechanism to provide
+exactly-once processing semantics. To achieve that, Flink does not purely rely on Kafka's consumer group
+offset tracking, but tracks and checkpoints these offsets internally as well.
+
+Please pick a package (maven artifact id) and class name for your use-case and environment.
+For most users, the `FlinkKafkaConsumer082` (part of `flink-connector-kafka`) is appropriate.
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left">Maven Dependency</th>
+      <th class="text-left">Supported since</th>
+      <th class="text-left">Class name</th>
+      <th class="text-left">Kafka version</th>
+      <th class="text-left">Notes</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+        <td>flink-connector-kafka</td>
+        <td>0.9.1, 0.10</td>
+        <td>FlinkKafkaConsumer081</td>
+        <td>0.8.1</td>
+        <td>Uses the <a href="https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example">SimpleConsumer</a> API of Kafka internally. Offsets are committed to ZK by Flink.</td>
+    </tr>
+    <tr>
+        <td>flink-connector-kafka</td>
+        <td>0.9.1, 0.10</td>
+        <td>FlinkKafkaConsumer082</td>
+        <td>0.8.2</td>
+        <td>Uses the <a href="https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example">SimpleConsumer</a> API of Kafka internally. Offsets are committed to ZK by Flink.</td>
+    </tr>
+  </tbody>
+</table>
+
+Then, import the connector in your maven project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-kafka</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary distribution. See how to link with them for cluster execution [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
+
+#### Installing Apache Kafka
+
+* Follow the instructions from [Kafka's quickstart](https://kafka.apache.org/documentation.html#quickstart) to download the code and launch a server (launching a Zookeeper and a Kafka server is required every time before starting the application).
+* On 32-bit computers [this](http://stackoverflow.com/questions/22325364/unrecognized-vm-option-usecompressedoops-when-running-kafka-from-my-ubuntu-in) problem may occur.
+* If the Kafka and Zookeeper servers are running on a remote machine, then the `advertised.host.name` setting in the `config/server.properties` file must be set to the machine's IP address.
+
+#### Kafka Consumer
+
+The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access to one topic. Its constructor takes the following parameters:
+
+1. The topic name
+2. A DeserializationSchema
+3. Properties for the Kafka consumer.
+  The following properties are required:
+  - "bootstrap.servers" (comma separated list of Kafka brokers)
+  - "zookeeper.connect" (comma separated list of Zookeeper servers)
+  - "group.id" (the id of the consumer group)
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+Properties properties = new Properties();
+properties.setProperty("bootstrap.servers", "localhost:9092");
+properties.setProperty("zookeeper.connect", "localhost:2181");
+properties.setProperty("group.id", "test");
+DataStream<String> stream = env
+	.addSource(new FlinkKafkaConsumer082<>("topic", new SimpleStringSchema(), properties));
+stream.print();
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val properties = new Properties();
+properties.setProperty("bootstrap.servers", "localhost:9092");
+properties.setProperty("zookeeper.connect", "localhost:2181");
+properties.setProperty("group.id", "test");
+val stream = env
+    .addSource(new FlinkKafkaConsumer082[String]("topic", new SimpleStringSchema(), properties))
+stream.print()
+{% endhighlight %}
+</div>
+</div>
+
+#### Kafka Consumers and Fault Tolerance
+
+With Flink's checkpointing enabled, the Flink Kafka Consumer will consume records from a topic and periodically checkpoint all
+its Kafka offsets, together with the state of other operations, in a consistent manner. In case of a job failure, Flink will restore
+the streaming program to the state of the latest checkpoint and re-consume the records from Kafka, starting from the offsets that were
+stored in the checkpoint.
+
+The interval of drawing checkpoints therefore defines how much the program may have to go back at most, in case of a failure.
+
+To use fault tolerant Kafka Consumers, checkpointing of the topology needs to be enabled at the execution environment:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.enableCheckpointing(5000); // checkpoint every 5000 msecs
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.enableCheckpointing(5000) // checkpoint every 5000 msecs
+{% endhighlight %}
+</div>
+</div>
+
+Also note that Flink can only restart the topology if enough processing slots are available.
+So if the topology fails due to loss of a TaskManager, there must still be enough slots available afterwards.
+Flink on YARN supports automatic restart of lost YARN containers.
+
+If checkpointing is not enabled, the Kafka consumer will periodically commit the offsets to Zookeeper.
+
+#### Kafka Producer
+
+The `FlinkKafkaProducer` writes data to a Kafka topic. The producer can specify a custom partitioner that assigns
+records to partitions.
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+stream.addSink(new FlinkKafkaProducer<String>("localhost:9092", "my-topic", new SimpleStringSchema()));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+stream.addSink(new FlinkKafkaProducer[String]("localhost:9092", "my-topic", new SimpleStringSchema()))
+{% endhighlight %}
+</div>
+</div>
+
+You can also pass a custom Kafka producer configuration to the `FlinkKafkaProducer` constructor. Please refer to
+the [Apache Kafka documentation](https://kafka.apache.org/documentation.html) for details on how to configure
+Kafka Producers.
\ No newline at end of file
diff --git a/docs/apis/streaming/connectors/rabbitmq.md b/docs/apis/streaming/connectors/rabbitmq.md
new file mode 100644
index 0000000000000..3d36a3532ad00
--- /dev/null
+++ b/docs/apis/streaming/connectors/rabbitmq.md
@@ -0,0 +1,102 @@
+---
+title: "RabbitMQ Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 4
+sub-nav-title: RabbitMQ
+---
+
+This connector provides access to data streams from [RabbitMQ](http://www.rabbitmq.com/). To use this connector, add the following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-rabbitmq</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary distribution. See how to link with them for cluster execution [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
+
+#### Installing RabbitMQ
+Follow the instructions from the [RabbitMQ download page](http://www.rabbitmq.com/download.html). After the installation the server automatically starts, and the application connecting to RabbitMQ can be launched.
+
+#### RabbitMQ Source
+
+The `RMQSource` class provides an interface for receiving data from RabbitMQ.
+
+The following have to be provided to the `RMQSource(…)` constructor, in order:
+
+- hostName: The RabbitMQ broker hostname.
+- queueName: The RabbitMQ queue name.
+- usesCorrelationId: `true` when correlation ids should be used, `false` otherwise (default is `false`).
+- deserializationSchema: Deserialization schema to turn messages into Java objects.
+
+This source can be operated in three different modes:
+
+1. Exactly-once (when checkpointed) with RabbitMQ transactions and messages with
+    unique correlation IDs.
+2. At-least-once (when checkpointed) with RabbitMQ transactions but no deduplication mechanism
+    (correlation id is not set).
+3. No strong delivery guarantees (without checkpointing) with RabbitMQ auto-commit mode.
+
+Correlation ids are a RabbitMQ application feature. You have to set them in the message properties
+when injecting messages into RabbitMQ. If you set `usesCorrelationId` to true and do not supply
+unique correlation ids, the source will throw an exception (if the correlation id is null) or ignore
+messages with non-unique correlation ids. If you set `usesCorrelationId` to false, then you don't
+have to supply correlation ids.
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<String> streamWithoutCorrelationIds = env
+	.addSource(new RMQSource<String>("localhost", "hello", new SimpleStringSchema()));
+streamWithoutCorrelationIds.print();
+
+DataStream<String> streamWithCorrelationIds = env
+	.addSource(new RMQSource<String>("localhost", "hello", true, new SimpleStringSchema()));
+streamWithCorrelationIds.print();
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val streamWithoutCorrelationIds = env
+    .addSource(new RMQSource[String]("localhost", "hello", new SimpleStringSchema))
+streamWithoutCorrelationIds.print()
+
+val streamWithCorrelationIds = env
+    .addSource(new RMQSource[String]("localhost", "hello", true, new SimpleStringSchema))
+streamWithCorrelationIds.print()
+{% endhighlight %}
+</div>
+</div>
+
+#### RabbitMQ Sink
+The `RMQSink` class provides an interface for sending data to RabbitMQ.
+
+The following have to be provided to the `RMQSink(…)` constructor, in order:
+
+1. The hostname
+2. The queue name
+3. Serialization schema
+
+Example:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+stream.addSink(new RMQSink<String>("localhost", "hello", new StringToByteSerializer()));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+stream.addSink(new RMQSink[String]("localhost", "hello", new StringToByteSerializer))
+{% endhighlight %}
+</div>
+</div>
+
+More about RabbitMQ can be found [here](http://www.rabbitmq.com/).
\ No newline at end of file
diff --git a/docs/apis/streaming/connectors/twitter.md b/docs/apis/streaming/connectors/twitter.md
new file mode 100644
index 0000000000000..4ab85f71edcbc
--- /dev/null
+++ b/docs/apis/streaming/connectors/twitter.md
@@ -0,0 +1,89 @@
+---
+title: "Twitter Connector"
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-parent: connectors
+sub-nav-pos: 5
+sub-nav-title: Twitter
+---
+
+The Twitter Streaming API provides access to the stream of tweets made available by Twitter. Flink Streaming comes with a built-in `TwitterSource` class for establishing a connection to this stream. To use this connector, add the following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-connector-twitter</artifactId>
+  <version>{{site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Note that the streaming connectors are currently not part of the binary distribution. See how to link with them for cluster execution [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
+
+#### Authentication
+In order to connect to the Twitter stream, you have to register your program and acquire the necessary authentication information. The process is described below.
+
+#### Acquiring the authentication information
+First of all, a Twitter account is needed. Sign up for free at [twitter.com/signup](https://twitter.com/signup) or sign in at Twitter's [Application Management](https://apps.twitter.com/) and register the application by clicking on the "Create New App" button. Fill out a form about your program and accept the Terms and Conditions.
+After selecting the application, the API key and API secret (called `consumerKey` and `consumerSecret` in `TwitterSource` respectively) are located on the "API Keys" tab. The necessary OAuth Access Token data (`token` and `secret` in `TwitterSource`) can be generated and acquired on the "Keys and Access Tokens" tab.
+Remember to keep these pieces of information secret and do not push them to public repositories.
+
+#### Accessing the authentication information
+Create a properties file, and pass its path in the constructor of `TwitterSource`. The content of the file should be similar to this:
+
+~~~bash
+#properties file for my app
+secret=***
+consumerSecret=***
+token=***-***
+consumerKey=***
+~~~
+
+#### Constructors
+The `TwitterSource` class has two constructors.
+
+1. `public TwitterSource(String authPath, int numberOfTweets);`
+to emit a finite number of tweets
+2. `public TwitterSource(String authPath);`
+for streaming
+
+Both constructors expect a `String authPath` argument determining the location of the properties file containing the authentication information. In the first case, `numberOfTweets` determines how many tweets the source emits.
+
+#### Usage
+In contrast to other connectors, the `TwitterSource` depends on no additional services. For example, the following code should run gracefully:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<String> streamSource = env.addSource(new TwitterSource("/PATH/TO/myFile.properties"));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val streamSource = env.addSource(new TwitterSource("/PATH/TO/myFile.properties"))
+{% endhighlight %}
+</div>
+</div>
+
+The `TwitterSource` emits strings containing JSON.
+To retrieve information from the JSON, you can add a FlatMap or a Map function that parses it. For example, the examples include an abstract `JSONParseFlatMap` class. `JSONParseFlatMap` is an extension of the `FlatMapFunction` and has a
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+String getField(String jsonText, String field);
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+def getField(jsonText: String, field: String): String
+{% endhighlight %}
+</div>
+</div>
+
+function which can be used to acquire the value of a given field.
+
+There are two basic types of tweets. The usual tweets contain information such as date and time of creation, id, user, language and many more details. The other type is the delete information.
+
+#### Example
+`TwitterStream` is an example of how to use `TwitterSource`. It implements a language frequency counter program.
diff --git a/docs/apis/fault_tolerance.md b/docs/apis/streaming/fault_tolerance.md
similarity index 74%
rename from docs/apis/fault_tolerance.md
rename to docs/apis/streaming/fault_tolerance.md
index 677ff95c7cde9..6b54bd9113fbd 100644
--- a/docs/apis/fault_tolerance.md
+++ b/docs/apis/streaming/fault_tolerance.md
@@ -1,6 +1,10 @@
 ---
 title: "Fault Tolerance"
 is_beta: false
+
+sub-nav-group: streaming
+sub-nav-id: fault_tolerance
+sub-nav-pos: 3
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -21,8 +25,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
 Flink's fault tolerance mechanism recovers programs in the presence of failures and
 continues to execute them. Such failures include machine hardware failures, network failures,
 transient program failures, etc.
@@ -99,7 +101,7 @@ env.getCheckpointConfig.setMaxConcurrentCheckpoints(1)
 
 ### Fault Tolerance Guarantees of Data Sources and Sinks
 
-Flink can guarantee exactly-once state updates to user-defined state only when the source participates in the 
+Flink can guarantee exactly-once state updates to user-defined state only when the source participates in the
 snapshotting mechanism. This is currently guaranteed for the Kafka source (and internal number generators), but
 not for other sources. The following table lists the state update guarantees of Flink coupled with the bundled sources:
 
@@ -146,7 +148,7 @@ not for other sources. The following table lists the state update guarantees of
 </table>
 
 To guarantee end-to-end exactly-once record delivery (in addition to exactly-once state semantics), the data sink needs
-to take part in the checkpointing mechanism. The following table lists the delivery guarantees (assuming exactly-once 
+to take part in the checkpointing mechanism. The following table lists the delivery guarantees (assuming exactly-once
 state updates) of Flink coupled with bundled sinks:
 
 <table class="table table-bordered">
@@ -191,75 +193,4 @@ state updates) of Flink coupled with bundled sinks:
   </tbody>
 </table>
 
-[Back to top](#top)
-
-
-Batch Processing Fault Tolerance (DataSet API)
-----------------------------------------------
-
-Fault tolerance for programs in the *DataSet API* works by retrying failed executions.
-The number of time that Flink retries the execution before the job is declared as failed is configurable
-via the *execution retries* parameter. A value of *0* effectively means that fault tolerance is deactivated.
-
-To activate the fault tolerance, set the *execution retries* to a value larger than zero. A common choice is a value
-of three.
-
-This example shows how to configure the execution retries for a Flink DataSet program.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-env.setNumberOfExecutionRetries(3);
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = ExecutionEnvironment.getExecutionEnvironment()
-env.setNumberOfExecutionRetries(3)
-{% endhighlight %}
-</div>
-</div>
-
-
-You can also define default values for the number of execution retries and the retry delay in the `flink-conf.yaml`:
-
-~~~
-execution-retries.default: 3
-~~~
-
-
-Retry Delays
-------------
-
-Execution retries can be configured to be delayed. Delaying the retry means that after a failed execution, the re-execution does not start
-immediately, but only after a certain delay.
-
-Delaying the retries can be helpful when the program interacts with external systems where for example connections or pending transactions should reach a timeout before re-execution is attempted.
-
-You can set the retry delay for each program as follows (the sample shows the DataStream API - the DataSet API works similarly):
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-env.getConfig().setExecutionRetryDelay(5000); // 5000 milliseconds delay
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = StreamExecutionEnvironment.getExecutionEnvironment()
-env.getConfig.setExecutionRetryDelay(5000) // 5000 milliseconds delay
-{% endhighlight %}
-</div>
-</div>
-
-You can also define the default value for the retry delay in the `flink-conf.yaml`:
-
-~~~
-execution-retries.delay: 10 s
-~~~
-
-[Back to top](#top)
-
-
+{% top %}
diff --git a/docs/apis/streaming/fig/LICENSE.txt b/docs/apis/streaming/fig/LICENSE.txt
new file mode 100644
index 0000000000000..35b867379a1e7
--- /dev/null
+++ b/docs/apis/streaming/fig/LICENSE.txt
@@ -0,0 +1,17 @@
+All image files in the folder and its subfolders are
+licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
\ No newline at end of file
diff --git a/docs/apis/fig/savepoints-overview.png b/docs/apis/streaming/fig/savepoints-overview.png
similarity index 100%
rename from docs/apis/fig/savepoints-overview.png
rename to docs/apis/streaming/fig/savepoints-overview.png
diff --git a/docs/apis/fig/savepoints-program_ids.png b/docs/apis/streaming/fig/savepoints-program_ids.png
similarity index 100%
rename from docs/apis/fig/savepoints-program_ids.png
rename to docs/apis/streaming/fig/savepoints-program_ids.png
diff --git a/docs/apis/streaming_guide.md b/docs/apis/streaming/index.md
similarity index 79%
rename from docs/apis/streaming_guide.md
rename to docs/apis/streaming/index.md
index a1d4d372a30de..06c0014c99b3d 100644
--- a/docs/apis/streaming_guide.md
+++ b/docs/apis/streaming/index.md
@@ -1,6 +1,16 @@
 ---
 title: "Flink DataStream API Programming Guide"
-is_beta: false
+
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 1
+top-nav-title: <strong>Streaming Guide</strong> (DataStream API)
+
+# Sub-level navigation
+sub-nav-group: streaming
+sub-nav-group-title: Streaming Guide
+sub-nav-pos: 1
+sub-nav-title: DataStream API
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -21,8 +31,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
 DataStream programs in Flink are regular programs that implement transformations on data streams
 (e.g., filtering, updating state, defining windows, aggregating). The data streams are initially created from various
 sources (e.g., message queues, socket streams, files). Results are returned via sinks, which may for
@@ -128,7 +136,7 @@ Just type some words hitting return for a new word. These will be the input to t
 word count program. If you want to see counts greater than 1, type the same word again and again within
 5 seconds (increase the window size from 5 seconds if you cannot type that fast &#9786;).
 
-[Back to top](#top)
+{% top %}
 
 
 Linking with Flink
@@ -201,7 +209,7 @@ In order to create your own Flink program, we encourage you to start with the
 [program skeleton](#program-skeleton) and gradually add your own
 [transformations](#transformations).
 
-[Back to top](#top)
+{% top %}
 
 Program Skeleton
 ----------------
@@ -241,8 +249,8 @@ Typically, you only need to use `getExecutionEnvironment()`, since this
 will do the right thing depending on the context: if you are executing
 your program inside an IDE or as a regular Java program it will create
 a local environment that will execute your program on your local machine. If
-you created a JAR file from your program, and invoke it through the [command line](cli.html)
-or the [web interface](web_client.html),
+you created a JAR file from your program, and invoke it through the [command line]({{ site.baseurl }}/apis/cli.html)
+or the [web interface]({{ site.baseurl }}/apis/web_client.html),
 the Flink cluster manager will execute your main method and `getExecutionEnvironment()` will return
 an execution environment for executing your program on a cluster.
 
@@ -398,7 +406,7 @@ env.execute()
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 DataStream Abstraction
 ----------------------
@@ -409,7 +417,7 @@ Transformations may return different subtypes of `DataStream` allowing specializ
 For example the `keyBy(…)` method returns a `KeyedDataStream` which is a stream of data that
 is logically partitioned by a certain key, and can be further windowed.
 
-[Back to top](#top)
+{% top %}
 
 Lazy Evaluation
 ---------------
@@ -423,7 +431,7 @@ or on a cluster depends on the type of `StreamExecutionEnvironment`.
 The lazy evaluation lets you construct sophisticated programs that Flink executes as one
 holistically planned unit.
 
-[Back to top](#top)
+{% top %}
 
 
 Transformations
@@ -1521,7 +1529,7 @@ someStream.map(...).isolateResources()
 </div>
 
 
-[Back to top](#top)
+{% top %}
 
 Specifying Keys
 ----------------
@@ -1542,7 +1550,7 @@ you do not need to physically pack the data stream types into keys and
 values. Keys are "virtual": they are defined as functions over the
 actual data to guide the grouping operator.
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#specifying-keys) on how to specify keys.
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#specifying-keys) on how to specify keys.
 Just replace `DataSet` with `DataStream`, and `groupBy` with `keyBy`.
 
 
@@ -1552,10 +1560,10 @@ Passing Functions to Flink
 
 Some transformations take user-defined functions as arguments.
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#passing-functions-to-flink).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#passing-functions-to-flink).
 
 
-[Back to top](#top)
+{% top %}
 
 
 Data Types
@@ -1565,9 +1573,9 @@ Flink places some restrictions on the type of elements that are used in DataStre
 of transformations. The reason for this is that the system analyzes the types to determine
 efficient execution strategies.
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#data-types).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#data-types).
 
-[Back to top](#top)
+{% top %}
 
 
 Data Sources
@@ -1622,7 +1630,7 @@ Collection-based:
 Custom:
 
 - `addSource` - Attache a new source function. For example, to read from Apache Kafka you can use
-    `addSource(new FlinkKafkaConsumer082<>(...))`. See [connectors](#connectors) for more details.
+    `addSource(new FlinkKafkaConsumer082<>(...))`. See [connectors]({{ site.baseurl }}/apis/streaming/connectors/) for more details.
 
 </div>
 
@@ -1674,12 +1682,12 @@ Collection-based:
 Custom:
 
 - `addSource` - Attache a new source function. For example, to read from Apache Kafka you can use
-    `addSource(new FlinkKafkaConsumer082<>(...))`. See [connectors](#connectors) for more details.
+    `addSource(new FlinkKafkaConsumer082<>(...))`. See [connectors]({{ site.baseurl }}/apis/streaming/connectors/) for more details.
 
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 
 Execution Configuration
@@ -1687,7 +1695,7 @@ Execution Configuration
 
 The `StreamExecutionEnvironment` also contains the `ExecutionConfig` which allows to set job specific configuration values for the runtime.
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#execution-configuration).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#execution-configuration).
 
 Parameters in the `ExecutionConfig` that pertain specifically to the DataStream API are:
 
@@ -1697,7 +1705,7 @@ Parameters in the `ExecutionConfig` that pertain specifically to the DataStream
 - `setAutoWatermarkInterval(long milliseconds)`: Set the interval for automatic watermark emission. You can
     get the current value with `long getAutoWatermarkInterval()`
 
-[Back to top](#top)
+{% top %}
 
 Data Sinks
 ----------
@@ -1762,7 +1770,7 @@ greater than 1, the output will also be prepended with the identifier of the tas
 </div>
 
 
-[Back to top](#top)
+{% top %}
 
 Debugging
 ---------
@@ -1881,7 +1889,7 @@ val myOutput: Iterator[(String, Int)] = DataStreamUtils.collect(myResult.getJava
 </div>
 
 
-[Back to top](#top)
+{% top %}
 
 
 Windows
@@ -2937,14 +2945,14 @@ nonKeyedStream.countWindowAll(1000, 100)
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Execution Parameters
 --------------------
 
 ### Fault Tolerance
 
-The [Fault Tolerance Documentation]({{ site.baseurl }}/apis/fault_tolerance.html) describes the options and parameters to enable and configure Flink's checkpointing mechanism.
+The [Fault Tolerance Documentation](fault_tolerance.html) describes the options and parameters to enable and configure Flink's checkpointing mechanism.
 
 ### Parallelism
 
@@ -2985,7 +2993,7 @@ To maximize throughput, set `setBufferTimeout(-1)` which will remove the timeout
 flushed when they are full. To minimize latency, set the timeout to a value close to 0 (for example 5 or 10 ms).
 A buffer timeout of 0 should be avoided, because it can cause severe performance degradation.
 
-[Back to top](#top)
+{% top %}
 
 Working with State
 ------------------
@@ -3161,7 +3169,7 @@ Flink currently only provides processing guarantees for jobs without iterations.
 
 Please note that records in flight in the loop edges (and the state changes associated with them) will be lost during failure.
 
-[Back to top](#top)
+{% top %}
 
 Iterations
 ----------
@@ -3274,765 +3282,25 @@ val iteratedStream = someIntegers.iterate(
 </div>
 </div>
 
-[Back to top](#top)
-
-Connectors
-----------
-
-<!-- TODO: reintroduce flume -->
-Connectors provide code for interfacing with various third-party systems.
-
-Currently these systems are supported:
-
- * [Apache Kafka](https://kafka.apache.org/) (sink/source)
- * [Elasticsearch](https://elastic.co/) (sink)
- * [Hadoop FileSystem](http://hadoop.apache.org) (sink)
- * [RabbitMQ](http://www.rabbitmq.com/) (sink/source)
- * [Twitter Streaming API](https://dev.twitter.com/docs/streaming-apis) (source)
-
-To run an application using one of these connectors, additional third party
-components are usually required to be installed and launched, e.g. the servers
-for the message queues. Further instructions for these can be found in the
-corresponding subsections. [Docker containers](#docker-containers-for-connectors)
-are also provided encapsulating these services to aid users getting started
-with connectors.
-
-### Apache Kafka
-
-This connector provides access to event streams served by [Apache Kafka](https://kafka.apache.org/).
-
-Flink provides special Kafka Connectors for reading and writing data from/to Kafka topics.
-The Flink Kafka Consumer integrates with Flink's checkpointing mechanism to provide
-exactly-once processing semantics. To achieve that, Flink does not purely rely on Kafka's consumer group
-offset tracking, but tracks and checkpoints these offsets internally as well.
-
-Please pick a package (maven artifact id) and class name for your use-case and environment.
-For most users, the `FlinkKafkaConsumer082` (part of `flink-connector-kafka`) is appropriate.
-
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left">Maven Dependency</th>
-      <th class="text-left">Supported since</th>
-      <th class="text-left">Class name</th>
-      <th class="text-left">Kafka version</th>
-      <th class="text-left">Notes</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-        <td>flink-connector-kafka</td>
-        <td>0.9.1, 0.10</td>
-        <td>FlinkKafkaConsumer081</td>
-        <td>0.8.1</td>
-        <td>Uses the <a href="https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example">SimpleConsumer</a> API of Kafka internally. Offsets are committed to ZK by Flink.</td>
-    </tr>
-    <tr>
-        <td>flink-connector-kafka</td>
-        <td>0.9.1, 0.10</td>
-        <td>FlinkKafkaConsumer082</td>
-        <td>0.8.2</td>
-        <td>Uses the <a href="https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example">SimpleConsumer</a> API of Kafka internally. Offsets are committed to ZK by Flink.</td>
-    </tr>
-  </tbody>
-</table>
-
-Then, import the connector in your maven project:
-
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-kafka</artifactId>
-  <version>{{site.version }}</version>
-</dependency>
-{% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary distribution. See how to link with them for cluster execution [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
-
-#### Installing Apache Kafka
-
-* Follow the instructions from [Kafka's quickstart](https://kafka.apache.org/documentation.html#quickstart) to download the code and launch a server (launching a Zookeeper and a Kafka server is required every time before starting the application).
-* On 32 bit computers [this](http://stackoverflow.com/questions/22325364/unrecognized-vm-option-usecompressedoops-when-running-kafka-from-my-ubuntu-in) problem may occur.
-* If the Kafka and Zookeeper servers are running on a remote machine, then the `advertised.host.name` setting in the `config/server.properties` file must be set to the machine's IP address.
-
-#### Kafka Consumer
-
-The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access to one topic. It takes the following parameters to the constructor:
-
-1. The topic name
-2. A DeserializationSchema
-3. Properties for the Kafka consumer.
-  The following properties are required:
-  - "bootstrap.servers" (comma separated list of Kafka brokers)
-  - "zookeeper.connect" (comma separated list of Zookeeper servers)
-  - "group.id" the id of the consumer group
-
-Example:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-Properties properties = new Properties();
-properties.setProperty("bootstrap.servers", "localhost:9092");
-properties.setProperty("zookeeper.connect", "localhost:2181");
-properties.setProperty("group.id", "test");
-DataStream<String> stream = env
-	.addSource(new FlinkKafkaConsumer082<>("topic", new SimpleStringSchema(), properties))
-	.print();
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val properties = new Properties();
-properties.setProperty("bootstrap.servers", "localhost:9092");
-properties.setProperty("zookeeper.connect", "localhost:2181");
-properties.setProperty("group.id", "test");
-stream = env
-    .addSource(new FlinkKafkaConsumer082[String]("topic", new SimpleStringSchema(), properties))
-    .print
-{% endhighlight %}
-</div>
-</div>
-
-#### Kafka Consumers and Fault Tolerance
-
-With Flink's checkpointing enabled, the Flink Kafka Consumer will consume records from a topic and periodically checkpoint all
-its Kafka offsets, together with the state of other operations, in a consistent manner. In case of a job failure, Flink will restore
-the streaming program to the state of the latest checkpoint and re-consume the records from Kafka, starting from the offsets that where
-stored in the checkpoint.
-
-The interval of drawing checkpoints therefore defines how much the program may have to go back at most, in case of a failure.
-
-To use fault tolerant Kafka Consumers, checkpointing of the topology needs to be enabled at the execution environment:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-env.enableCheckpointing(5000); // checkpoint every 5000 msecs
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = StreamExecutionEnvironment.getExecutionEnvironment()
-env.enableCheckpointing(5000) // checkpoint every 5000 msecs
-{% endhighlight %}
-</div>
-</div>
-
-Also note that Flink can only restart the topology if enough processing slots are available to restart the topology.
-So if the topology fails due to loss of a TaskManager, there must still be enough slots available afterwards.
-Flink on YARN supports automatic restart of lost YARN containers.
-
-If checkpointing is not enabled, the Kafka consumer will periodically commit the offsets to Zookeeper.
-
-#### Kafka Producer
-
-The `FlinkKafkaProducer` writes data to a Kafka topic. The producer can specify a custom partitioner that assigns
-recors to partitions.
-
-Example:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-stream.addSink(new FlinkKafkaProducer<String>("localhost:9092", "my-topic", new SimpleStringSchema()));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-stream.addSink(new FlinkKafkaProducer[String]("localhost:9092", "my-topic", new SimpleStringSchema()))
-{% endhighlight %}
-</div>
-</div>
-
-You can also define a custom Kafka producer configuration for the KafkaSink with the constructor. Please refer to
-the [Apache Kafka documentation](https://kafka.apache.org/documentation.html) for details on how to configure
-Kafka Producers.
-
-[Back to top](#top)
-
-### Elasticsearch
-
-This connector provides a Sink that can write to an
-[Elasticsearch](https://elastic.co/) Index. To use this connector, add the
-following dependency to your project:
-
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-elasticsearch</artifactId>
-  <version>{{site.version }}</version>
-</dependency>
-{% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary
-distribution. See
-[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution)
-for information about how to package the program with the libraries for
-cluster execution.
-
-#### Installing Elasticsearch
-
-Instructions for setting up an Elasticsearch cluster can be found
-[here](https://www.elastic.co/guide/en/elasticsearch/reference/current/setup.html).
-Make sure to set and remember a cluster name. This must be set when
-creating a Sink for writing to your cluster.
-
-#### Elasticsearch Sink
-The connector provides a Sink that can send data to an Elasticsearch Index.
-
-The sink can use two different methods for communicating with Elasticsearch:
-
-1. An embedded Node
-2. The TransportClient
-
-See [here](https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/client.html)
-for information about the differences between the two modes.
-
-This code shows how to create a sink that uses an embedded Node for
-communication:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<String> input = ...;
-
-Map<String, String> config = Maps.newHashMap();
-// This instructs the sink to emit after every element, otherwise they would be buffered
-config.put("bulk.flush.max.actions", "1");
-config.put("cluster.name", "my-cluster-name");
-
-input.addSink(new ElasticsearchSink<>(config, new IndexRequestBuilder<String>() {
-    @Override
-    public IndexRequest createIndexRequest(String element, RuntimeContext ctx) {
-        Map<String, Object> json = new HashMap<>();
-        json.put("data", element);
-
-        return Requests.indexRequest()
-                .index("my-index")
-                .type("my-type")
-                .source(json);
-    }
-}));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[String] = ...
-
-val config = new util.HashMap[String, String]
-config.put("bulk.flush.max.actions", "1")
-config.put("cluster.name", "my-cluster-name")
-
-input.addSink(new ElasticsearchSink(config, new IndexRequestBuilder[String] {
-  override def createIndexRequest(element: String, ctx: RuntimeContext): IndexRequest = {
-    val json = new util.HashMap[String, AnyRef]
-    json.put("data", element)
-    println("SENDING: " + element)
-    Requests.indexRequest.index("my-index").`type`("my-type").source(json)
-  }
-}))
-{% endhighlight %}
-</div>
-</div>
-
-Note how a Map of Strings is used to configure the Sink. The configuration keys
-are documented in the Elasticsearch documentation
-[here](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html).
-Especially important is the `cluster.name` parameter that must correspond to
-the name of your cluster.
-
-Internally, the sink uses a `BulkProcessor` to send index requests to the cluster.
-This will buffer elements before sending a request to the cluster. The behaviour of the
-`BulkProcessor` can be configured using these config keys:
- * **bulk.flush.max.actions**: Maximum number of elements to buffer
- * **bulk.flush.max.size.mb**: Maximum amount of data (in megabytes) to buffer
- * **bulk.flush.interval.ms**: Interval (in milliseconds) at which to flush data, regardless of
-  the other two settings
-
-This example code does the same, but with a `TransportClient`:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<String> input = ...;
-
-Map<String, String> config = Maps.newHashMap();
-// This instructs the sink to emit after every element, otherwise they would be buffered
-config.put("bulk.flush.max.actions", "1");
-config.put("cluster.name", "my-cluster-name");
-
-List<TransportAddress> transports = new ArrayList<TransportAddress>();
-transports.add(new InetSocketTransportAddress("node-1", 9300));
-transports.add(new InetSocketTransportAddress("node-2", 9300));
-
-input.addSink(new ElasticsearchSink<>(config, transports, new IndexRequestBuilder<String>() {
-    @Override
-    public IndexRequest createIndexRequest(String element, RuntimeContext ctx) {
-        Map<String, Object> json = new HashMap<>();
-        json.put("data", element);
-
-        return Requests.indexRequest()
-                .index("my-index")
-                .type("my-type")
-                .source(json);
-    }
-}));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[String] = ...
-
-val config = new util.HashMap[String, String]
-config.put("bulk.flush.max.actions", "1")
-config.put("cluster.name", "my-cluster-name")
-
-val transports = new util.ArrayList[TransportAddress]
-transports.add(new InetSocketTransportAddress("node-1", 9300))
-transports.add(new InetSocketTransportAddress("node-2", 9300))
-
-input.addSink(new ElasticsearchSink(config, transports, new IndexRequestBuilder[String] {
-  override def createIndexRequest(element: String, ctx: RuntimeContext): IndexRequest = {
-    val json = new util.HashMap[String, AnyRef]
-    json.put("data", element)
-    println("SENDING: " + element)
-    Requests.indexRequest.index("my-index").`type`("my-type").source(json)
-  }
-}))
-{% endhighlight %}
-</div>
-</div>
-
-The difference is that we now need to provide a list of Elasticsearch Nodes
-to which the sink should connect using a `TransportClient`.
-
-More information about Elasticsearch can be found [here](https://elastic.co).
-
-[Back to top](#top)
-
-### Hadoop FileSystem
-
-This connector provides a Sink that writes rolling files to any filesystem supported by
-Hadoop FileSystem. To use this connector, add the
-following dependency to your project:
-
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-filesystem</artifactId>
-  <version>{{site.version}}</version>
-</dependency>
-{% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary
-distribution. See
-[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution)
-for information about how to package the program with the libraries for
-cluster execution.
-
-#### Rolling File Sink
-
-The rolling behaviour as well as the writing can be configured but we will get to that later.
-This is how you can create a default rolling sink:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<String> input = ...;
-
-input.addSink(new RollingSink<String>("/base/path"));
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[String] = ...
-
-input.addSink(new RollingSink("/base/path"))
-
-{% endhighlight %}
-</div>
-</div>
-
-The only required parameter is the base path where the rolling files (buckets) will be
-stored. The sink can be configured by specifying a custom bucketer, writer and batch size.
-
-By default the rolling sink will use the pattern `"yyyy-MM-dd--HH"` to name the rolling buckets.
-This pattern is passed to `SimpleDateFormat` with the current system time to form a bucket path. A
-new bucket will be created whenever the bucket path changes. For example, if you have a pattern
-that contains minutes as the finest granularity you will get a new bucket every minute.
-Each bucket is itself a directory that contains several part files: Each parallel instance
-of the sink will create its own part file and when part files get too big the sink will also
-create a new part file next to the others. To specify a custom bucketer use `setBucketer()`
-on a `RollingSink`.
-
-The default writer is `StringWriter`. This will call `toString()` on the incoming elements
-and write them to part files, separated by newline. To specify a custom writer use `setWriter()`
-on a `RollingSink`. If you want to write Hadoop SequenceFiles you can use the provided
-`SequenceFileWriter` which can also be configured to use compression.
-
-The last configuration option is the batch size. This specifies when a part file should be closed
-and a new one started. (The default part file size is 384 MB).
-
-Example:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<IntWritable,Text>> input = ...;
-
-RollingSink<Tuple2<IntWritable,Text>> sink = new RollingSink<Tuple2<IntWritable,Text>>("/base/path");
-sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd--HHmm"));
-sink.setWriter(new SequenceFileWriter<IntWritable, Text>());
-sink.setBatchSize(1024 * 1024 * 400); // this is 400 MB
-
-input.addSink(sink);
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[Tuple2[IntWritable, Text]] = ...
-
-val sink = new RollingSink[Tuple2[IntWritable, Text]]("/base/path")
-sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd--HHmm"))
-sink.setWriter(new SequenceFileWriter[IntWritable, Text]())
-sink.setBatchSize(1024 * 1024 * 400) // this is 400 MB
-
-input.addSink(sink)
-
-{% endhighlight %}
-</div>
-</div>
-
-This will create a sink that writes to bucket files that follow this schema:
-
-```
-/base/path/{date-time}/part-{parallel-task}-{count}
-```
-
-Where `date-time` is the string that we get from the date/time format, `parallel-task` is the index
-of the parallel sink instance and `count` is the running number of part files that were created
-because of the batch size.
-
-For in-depth information, please refer to the JavaDoc for
-[RollingSink](http://flink.apache.org/docs/latest/api/java/org/apache/flink/streaming/connectors/fs/RollingSink.html).
-
-[Back to top](#top)
-
-### RabbitMQ
-
-This connector provides access to data streams from [RabbitMQ](http://www.rabbitmq.com/). To use this connector, add the following dependency to your project:
-
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-rabbitmq</artifactId>
-  <version>{{site.version }}</version>
-</dependency>
-{% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary distribution. See [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution) for information about linking with them for cluster execution.
-
-#### Installing RabbitMQ
-Follow the instructions from the [RabbitMQ download page](http://www.rabbitmq.com/download.html). After the installation the server automatically starts, and the application connecting to RabbitMQ can be launched.
-
-#### RabbitMQ Source
-
-A class which provides an interface for receiving data from RabbitMQ.
-
-The following have to be provided to the `RMQSource(…)` constructor, in order:
-
-- hostName: The RabbitMQ broker hostname.
-- queueName: The RabbitMQ queue name.
-- usesCorrelationId: `true` when correlation ids should be used, `false` otherwise (default is `false`).
-- deserializationSchema: Deserialization schema to turn messages into Java objects.
-
-This source can be operated in three different modes:
-
-1. Exactly-once (when checkpointed) with RabbitMQ transactions and messages with
-    unique correlation IDs.
-2. At-least-once (when checkpointed) with RabbitMQ transactions but no deduplication mechanism
-    (correlation id is not set).
-3. No strong delivery guarantees (without checkpointing) with RabbitMQ auto-commit mode.
-
-Correlation ids are a RabbitMQ application feature. You have to set it in the message properties
-when injecting messages into RabbitMQ. If you set `usesCorrelationId` to true and do not supply
-unique correlation ids, the source will throw an exception (if the correlation id is null) or ignore
-messages with non-unique correlation ids. If you set `usesCorrelationId` to false, then you don't
-have to supply correlation ids.
-
-Example:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<String> streamWithoutCorrelationIds = env
-	.addSource(new RMQSource<String>("localhost", "hello", new SimpleStringSchema()));
-streamWithoutCorrelationIds.print();
-
-DataStream<String> streamWithCorrelationIds = env
-	.addSource(new RMQSource<String>("localhost", "hello", true, new SimpleStringSchema()));
-streamWithCorrelationIds.print();
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val streamWithoutCorrelationIds = env
-    .addSource(new RMQSource[String]("localhost", "hello", new SimpleStringSchema))
-streamWithoutCorrelationIds.print()
-
-val streamWithCorrelationIds = env
-    .addSource(new RMQSource[String]("localhost", "hello", true, new SimpleStringSchema))
-streamWithCorrelationIds.print()
-{% endhighlight %}
-</div>
-</div>
-
-#### RabbitMQ Sink
-A class providing an interface for sending data to RabbitMQ.
-
-The following have to be provided to the `RMQSink(…)` constructor, in order:
-
-1. The hostname
-2. The queue name
-3. Serialization schema
-
-Example:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-stream.addSink(new RMQSink<String>("localhost", "hello", new StringToByteSerializer()));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-stream.addSink(new RMQSink[String]("localhost", "hello", new StringToByteSerializer))
-{% endhighlight %}
-</div>
-</div>
-
-More about RabbitMQ can be found [here](http://www.rabbitmq.com/).
-
-[Back to top](#top)
-
-### Twitter Streaming API
-
-The Twitter Streaming API makes it possible to connect to the stream of tweets made available by Twitter. Flink Streaming comes with a built-in `TwitterSource` class for establishing a connection to this stream. To use this connector, add the following dependency to your project:
-
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-connector-twitter</artifactId>
-  <version>{{site.version }}</version>
-</dependency>
-{% endhighlight %}
-
-Note that the streaming connectors are currently not part of the binary distribution. See [here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution) for information about linking with them for cluster execution.
-
-#### Authentication
-In order to connect to the Twitter stream, the user has to register their program and acquire the necessary authentication information. The process is described below.
-
-#### Acquiring the authentication information
-First of all, a Twitter account is needed. Sign up for free at [twitter.com/signup](https://twitter.com/signup) or sign in at Twitter's [Application Management](https://apps.twitter.com/) and register the application by clicking on the "Create New App" button. Fill out a form about your program and accept the Terms and Conditions.
-After selecting the application, the API key and API secret (called `consumerKey` and `consumerSecret` in `TwitterSource`, respectively) are located on the "API Keys" tab. The necessary OAuth Access Token data (`token` and `secret` in `TwitterSource`) can be generated and acquired on the "Keys and Access Tokens" tab.
-Remember to keep these pieces of information secret and do not push them to public repositories.
-
-#### Accessing the authentication information
-Create a properties file, and pass its path in the constructor of `TwitterSource`. The content of the file should be similar to this:
-
-~~~bash
-#properties file for my app
-secret=***
-consumerSecret=***
-token=***-***
-consumerKey=***
-~~~
-
-#### Constructors
-The `TwitterSource` class has two constructors.
-
-1. `public TwitterSource(String authPath, int numberOfTweets);`
-to emit a finite number of tweets
-2. `public TwitterSource(String authPath);`
-for streaming
-
-Both constructors expect a `String authPath` argument determining the location of the properties file containing the authentication information. In the first case, `numberOfTweets` determines how many tweets the source emits.
-
-#### Usage
-In contrast to other connectors, the `TwitterSource` depends on no additional services. For example the following code should run gracefully:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<String> streamSource = env.addSource(new TwitterSource("/PATH/TO/myFile.properties"));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val streamSource = env.addSource(new TwitterSource("/PATH/TO/myFile.properties"))
-{% endhighlight %}
-</div>
-</div>
-
-The `TwitterSource` emits strings containing JSON.
-To retrieve information from the JSON you can add a FlatMap or a Map function that handles JSON. For example, the examples include an abstract class `JSONParseFlatMap`. `JSONParseFlatMap` is an extension of the `FlatMapFunction` and has a
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-String getField(String jsonText, String field);
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-getField(jsonText : String, field : String) : String
-{% endhighlight %}
-</div>
-</div>
-
-function which can be used to acquire the value of a given field.
-
-There are two basic types of tweets. The usual tweets contain information such as date and time of creation, id, user, language and many more details. The other type is the delete information.
-
-#### Example
-`TwitterStream` is an example of how to use `TwitterSource`. It implements a language frequency counter program.
-
-[Back to top](#top)
-
-### Docker containers for connectors
-
-A Docker container is provided with all the required configurations for test running the connectors of Apache Flink. The servers for the message queues will be running on the docker container while the example topology can be run on the user's computer.
-
-#### Installing Docker
-The official Docker installation guide can be found [here](https://docs.docker.com/installation/).
-After installing Docker an image can be pulled for each connector. Containers can be started from these images where all the required configurations are set.
-
-#### Creating a jar with all the dependencies
-For the easiest setup, create a jar with all the dependencies of the *flink-streaming-connectors* project.
-
-~~~bash
-cd /PATH/TO/GIT/flink/flink-staging/flink-streaming-connectors
-mvn assembly:assembly
-~~~
-
-This creates an assembly jar under *flink-streaming-connectors/target*.
-
-#### RabbitMQ
-Pull the docker image:
-
-~~~bash
-sudo docker pull flinkstreaming/flink-connectors-rabbitmq
-~~~
-
-To run the container, type:
-
-~~~bash
-sudo docker run -p 127.0.0.1:5672:5672 -t -i flinkstreaming/flink-connectors-rabbitmq
-~~~
-
-Now a terminal has started running from the image with all the necessary configurations to test run the RabbitMQ connector. The -p flag binds the localhost's and the Docker container's ports so RabbitMQ can communicate with the application through these.
-
-To start the RabbitMQ server:
-
-~~~bash
-sudo /etc/init.d/rabbitmq-server start
-~~~
-
-To launch the example on the host computer, execute:
-
-~~~bash
-java -cp /PATH/TO/JAR-WITH-DEPENDENCIES org.apache.flink.streaming.connectors.rabbitmq.RMQTopology \
-> log.txt 2> errorlog.txt
-~~~
-
-There are two connectors in the example. One that sends messages to RabbitMQ, and one that receives messages from the same queue. In the logger messages, the arriving messages can be observed in the following format:
-
-~~~
-<DATE> INFO rabbitmq.RMQTopology: String: <one> arrived from RMQ
-<DATE> INFO rabbitmq.RMQTopology: String: <two> arrived from RMQ
-<DATE> INFO rabbitmq.RMQTopology: String: <three> arrived from RMQ
-<DATE> INFO rabbitmq.RMQTopology: String: <four> arrived from RMQ
-<DATE> INFO rabbitmq.RMQTopology: String: <five> arrived from RMQ
-~~~
-
-#### Apache Kafka
-
-Pull the image:
-
-~~~bash
-sudo docker pull flinkstreaming/flink-connectors-kafka
-~~~
-
-To run the container, type:
-
-~~~bash
-sudo docker run -p 127.0.0.1:2181:2181 -p 127.0.0.1:9092:9092 -t -i \
-flinkstreaming/flink-connectors-kafka
-~~~
-
-Now a terminal has started running from the image with all the necessary configurations to test run the Kafka connector. The -p flag binds the localhost's and the Docker container's ports so Kafka can communicate with the application through these.
-First start ZooKeeper in the background:
-
-~~~bash
-/kafka_2.9.2-0.8.1.1/bin/zookeeper-server-start.sh /kafka_2.9.2-0.8.1.1/config/zookeeper.properties \
-> zookeeperlog.txt &
-~~~
-
-Then start the Kafka server in the background:
-
-~~~bash
-/kafka_2.9.2-0.8.1.1/bin/kafka-server-start.sh /kafka_2.9.2-0.8.1.1/config/server.properties \
- > serverlog.txt 2> servererr.txt &
-~~~
-
-To launch the example on the host computer, execute:
-
-~~~bash
-java -cp /PATH/TO/JAR-WITH-DEPENDENCIES org.apache.flink.streaming.connectors.kafka.KafkaTopology \
-> log.txt 2> errorlog.txt
-~~~
-
-
-In the example there are two connectors. One that sends messages to Kafka, and one that receives messages from the same queue. In the logger messages, the arriving messages can be observed in the following format:
-
-~~~
-<DATE> INFO kafka.KafkaTopology: String: (0) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (1) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (2) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (3) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (4) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (5) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (6) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (7) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (8) arrived from Kafka
-<DATE> INFO kafka.KafkaTopology: String: (9) arrived from Kafka
-~~~
-
-
-[Back to top](#top)
+{% top %}
 
 Program Packaging & Distributed Execution
 -----------------------------------------
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#program-packaging-and-distributed-execution).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#program-packaging-and-distributed-execution).
 
-[Back to top](#top)
+{% top %}
 
 Parallel Execution
 ------------------
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#parallel-execution).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#parallel-execution).
 
-[Back to top](#top)
+{% top %}
 
 Execution Plans
 ---------------
 
-See [the relevant section of the DataSet API documentation](programming_guide.html#execution-plans).
+See [the relevant section of the DataSet API documentation]({{ site.baseurl }}/apis/batch/index.html#execution-plans).
 
-[Back to top](#top)
+{% top %}
diff --git a/docs/apis/savepoints.md b/docs/apis/streaming/savepoints.md
similarity index 99%
rename from docs/apis/savepoints.md
rename to docs/apis/streaming/savepoints.md
index 80bd83dfcb4e4..ee4155f0041f3 100644
--- a/docs/apis/savepoints.md
+++ b/docs/apis/streaming/savepoints.md
@@ -1,6 +1,8 @@
 ---
 title: "Savepoints"
 is_beta: false
+sub-nav-group: streaming
+sub-nav-pos: 4
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/apis/state_backends.md b/docs/apis/streaming/state_backends.md
similarity index 98%
rename from docs/apis/state_backends.md
rename to docs/apis/streaming/state_backends.md
index ad191f92c8526..06f9b24a093dc 100644
--- a/docs/apis/state_backends.md
+++ b/docs/apis/streaming/state_backends.md
@@ -1,5 +1,8 @@
 ---
 title:  "State Backends"
+sub-nav-group: streaming
+sub-nav-pos: 1
+sub-nav-parent: fault_tolerance
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -61,7 +64,7 @@ The MemoryStateBackend is encouraged for:
 
 ### The FsStateBackend
 
-The *FsStateBackend* (FileSystemStateBackend) is configured with a file system URL (type, address, path), such as for example "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints". 
+The *FsStateBackend* (FileSystemStateBackend) is configured with a file system URL (type, address, path), such as for example "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints".
 
 The FsStateBackend holds in-flight data in the TaskManager's memory. Upon checkpoints, it writes state snapshots into files in the configured file system and directory. Minimal metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
 
@@ -118,4 +121,3 @@ state.backend: filesystem
 
 state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints
 ~~~
-
diff --git a/docs/apis/storm_compatibility.md b/docs/apis/streaming/storm_compatibility.md
similarity index 99%
rename from docs/apis/storm_compatibility.md
rename to docs/apis/streaming/storm_compatibility.md
index 852bbef76d6a8..0ea0b01919628 100644
--- a/docs/apis/storm_compatibility.md
+++ b/docs/apis/streaming/storm_compatibility.md
@@ -1,6 +1,8 @@
 ---
 title: "Storm Compatibility"
 is_beta: true
+sub-nav-group: streaming
+sub-nav-pos: 5
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -97,7 +99,7 @@ if(runLocal) { // submit to test cluster
 </div>
 </div>
 
-# Embed Storm Operators in Flink Streaming Programs 
+# Embed Storm Operators in Flink Streaming Programs
 
 As an alternative, Spouts and Bolts can be embedded into regular streaming programs.
 The Storm compatibility layer offers a wrapper class for each, namely `SpoutWrapper` and `BoltWrapper` (`org.apache.flink.storm.wrappers`).
diff --git a/docs/apis/web_client.md b/docs/apis/web_client.md
index 4749d90497171..601ffc6daaa63 100644
--- a/docs/apis/web_client.md
+++ b/docs/apis/web_client.md
@@ -1,5 +1,8 @@
 ---
 title:  "Web Client"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 6
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/apis/zip_elements_guide.md b/docs/apis/zip_elements_guide.md
deleted file mode 100644
index b636fe446c44e..0000000000000
--- a/docs/apis/zip_elements_guide.md
+++ /dev/null
@@ -1,106 +0,0 @@
----
-title: "Zipping Elements in a DataSet"
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-In certain algorithms, one may need to assign unique identifiers to data set elements.
-This document shows how {% gh_link /flink-java/src/main/java/org/apache/flink/api/java/utils/DataSetUtils.java "DataSetUtils" %} can be used for that purpose.
-
-* This will be replaced by the TOC
-{:toc}
-
-### Zip with a Dense Index
-To assign consecutive labels to the elements, call the `zipWithIndex` method. It receives a data set as input and returns a new data set of (unique id, initial value) tuples.
-For example, the following code:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-env.setParallelism(1);
-DataSet<String> in = env.fromElements("A", "B", "C", "D", "E", "F");
-
-DataSet<Tuple2<Long, String>> result = DataSetUtils.zipWithIndex(in);
-
-result.writeAsCsv(resultPath, "\n", ",");
-env.execute();
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-import org.apache.flink.api.scala._
-
-val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
-env.setParallelism(1)
-val input: DataSet[String] = env.fromElements("A", "B", "C", "D", "E", "F")
-
-val result: DataSet[(Long, String)] = input.zipWithIndex
-
-result.writeAsCsv(resultPath, "\n", ",")
-env.execute()
-{% endhighlight %}
-</div>
-
-</div>
-
-will yield the tuples: (0,A), (1,B), (2,C), (3,D), (4,E), (5,F)
-
-[Back to top](#top)
-
-### Zip with a Unique Identifier
-In many cases, one may not need to assign consecutive labels.
-`zipWithUniqueId` works in a pipelined fashion, speeding up the label assignment process. This method receives a data set as input and returns a new data set of (unique id, initial value) tuples.
-For example, the following code:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-env.setParallelism(1);
-DataSet<String> in = env.fromElements("A", "B", "C", "D", "E", "F");
-
-DataSet<Tuple2<Long, String>> result = DataSetUtils.zipWithUniqueId(in);
-
-result.writeAsCsv(resultPath, "\n", ",");
-env.execute();
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-import org.apache.flink.api.scala._
-
-val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
-env.setParallelism(1)
-val input: DataSet[String] = env.fromElements("A", "B", "C", "D", "E", "F")
-
-val result: DataSet[(Long, String)] = input.zipWithUniqueId
-
-result.writeAsCsv(resultPath, "\n", ",")
-env.execute()
-{% endhighlight %}
-</div>
-
-</div>
-
-will yield the tuples: (0,A), (2,B), (4,C), (6,D), (8,E), (10,F)
-
-[Back to top](#top)
\ No newline at end of file
diff --git a/docs/internals/add_operator.md b/docs/internals/add_operator.md
index 241304d052ec9..8dad0af534818 100644
--- a/docs/internals/add_operator.md
+++ b/docs/internals/add_operator.md
@@ -1,5 +1,9 @@
 ---
 title:  "How to add a new Operator"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 8
+top-nav-title: "How-To: Add an Operator"
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/general_arch.md b/docs/internals/general_arch.md
index 4628e0b1dd097..16e709331eb5c 100644
--- a/docs/internals/general_arch.md
+++ b/docs/internals/general_arch.md
@@ -1,5 +1,9 @@
 ---
 title:  "General Architecture and Process Model"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 3
+top-nav-title: Architecture and Process Model
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/ide_setup.md b/docs/internals/ide_setup.md
index 1e0e77a40d612..1b0b91a9c2ea1 100644
--- a/docs/internals/ide_setup.md
+++ b/docs/internals/ide_setup.md
@@ -1,5 +1,8 @@
 ---
-title: "IDE setup"
+title: "IDE Setup"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 1
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/job_scheduling.md b/docs/internals/job_scheduling.md
index 7e24cdbbc8937..cce78d911431a 100644
--- a/docs/internals/job_scheduling.md
+++ b/docs/internals/job_scheduling.md
@@ -1,5 +1,8 @@
 ---
 title:  "Jobs and Scheduling"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 7
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/logging.md b/docs/internals/logging.md
index dee3d016f9b57..d2c0cbaefa2b9 100644
--- a/docs/internals/logging.md
+++ b/docs/internals/logging.md
@@ -1,5 +1,9 @@
 ---
 title: "How to use logging"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 2
+top-nav-title: Logging
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/monitoring_rest_api.md b/docs/internals/monitoring_rest_api.md
index 643db6b88f0a3..70952f5272a62 100644
--- a/docs/internals/monitoring_rest_api.md
+++ b/docs/internals/monitoring_rest_api.md
@@ -1,5 +1,8 @@
 ---
 title:  "Monitoring REST API"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 6
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/stream_checkpointing.md b/docs/internals/stream_checkpointing.md
index 48355a1fd7a3d..ae93b08d0bddd 100644
--- a/docs/internals/stream_checkpointing.md
+++ b/docs/internals/stream_checkpointing.md
@@ -1,5 +1,9 @@
 ---
 title:  "Data Streaming Fault Tolerance"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 4
+top-nav-title: Fault Tolerance for Data Streaming
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/internals/types_serialization.md b/docs/internals/types_serialization.md
index 8a93ccd519c83..7ff21f2070e1c 100644
--- a/docs/internals/types_serialization.md
+++ b/docs/internals/types_serialization.md
@@ -1,5 +1,8 @@
 ---
 title:  "Type Extraction and Serialization"
+# Top navigation
+top-nav-group: internals
+top-nav-pos: 5
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/gelly_guide.md b/docs/libs/gelly_guide.md
index ff145063b4a69..a3ede7b3e2690 100644
--- a/docs/libs/gelly_guide.md
+++ b/docs/libs/gelly_guide.md
@@ -1,5 +1,14 @@
 ---
 title: "Gelly: Flink Graph API"
+# Top navigation
+top-nav-group: libs
+top-nav-pos: 1
+top-nav-title: "Graphs: Gelly"
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: libs
+sub-nav-pos: 1
+sub-nav-title: Gelly
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,8 +29,6 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-<a href="#top"></a>
-
 Gelly is a Graph API for Flink. It contains a set of methods and utilities which aim to simplify the development of graph analysis applications in Flink. In Gelly, graphs can be transformed and modified using high-level functions similar to the ones provided by the batch processing API. Gelly provides methods to create, transform and modify graphs, as well as a library of graph algorithms.
 
 * This will be replaced by the TOC
@@ -114,7 +121,7 @@ val weight = e.getValue // weight = 0.5
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Graph Creation
 -----------
@@ -219,11 +226,11 @@ val graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env)
 {% endhighlight %}
 
 * from a CSV file of Edge data and an optional CSV file of Vertex data.
-In this case, Gelly will convert each row from the Edge CSV file to an `Edge`. 
-The first field of the each row will be the source ID, the second field will be the target ID and the third field (if present) will be the edge value. 
+In this case, Gelly will convert each row from the Edge CSV file to an `Edge`.
+The first field of each row will be the source ID, the second field will be the target ID and the third field (if present) will be the edge value.
 If the edges have no associated value, set the edge value type parameter (3rd type argument) to `NullValue`.
 You can also specify that the vertices are initialized with a vertex value.
-If you provide a path to a CSV file via `pathVertices`, each row of this file will be converted to a `Vertex`. 
+If you provide a path to a CSV file via `pathVertices`, each row of this file will be converted to a `Vertex`.
 The first field of each row will be the vertex ID and the second field will be the vertex value.
 If you provide a vertex value initializer `MapFunction` via the `vertexValueInitializer` parameter, then this function is used to generate the vertex values.
 The set of vertices will be created automatically from the edges input.
@@ -244,7 +251,7 @@ val graph = Graph.fromCsvReader[String, Long, Double](
 val simpleGraph = Graph.fromCsvReader[Long, NullValue, NullValue](
 		pathEdges = "path/to/edge/input",
 		env = env)
-		
+
 // create a Graph with Double Vertex values generated by a vertex value initializer and no Edge values
 val simpleGraph = Graph.fromCsvReader[Long, Double, NullValue](
         pathEdges = "path/to/edge/input",
@@ -279,11 +286,11 @@ If no vertex input is provided during Graph creation, Gelly will automatically p
 ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
 
 // initialize the vertex value to be equal to the vertex ID
-Graph<Long, Long, String> graph = Graph.fromCollection(edgeList, 
+Graph<Long, Long, String> graph = Graph.fromCollection(edgeList,
 				new MapFunction<Long, Long>() {
-					public Long map(Long value) { 
-						return value; 
-					} 
+					public Long map(Long value) {
+						return value;
+					}
 				}, env);
 {% endhighlight %}
 </div>
@@ -313,7 +320,7 @@ val graph = Graph.fromCollection(edgeList,
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Graph Properties
 ------------
@@ -333,10 +340,10 @@ DataSet<Edge<K, EV>> getEdges()
 DataSet<K> getVertexIds()
 
 // get the source-target pairs of the edge IDs as a DataSet
-DataSet<Tuple2<K, K>> getEdgeIds() 
+DataSet<Tuple2<K, K>> getEdgeIds()
 
 // get a DataSet of <vertex ID, in-degree> pairs for all vertices
-DataSet<Tuple2<K, Long>> inDegrees() 
+DataSet<Tuple2<K, Long>> inDegrees()
 
 // get a DataSet of <vertex ID, out-degree> pairs for all vertices
 DataSet<Tuple2<K, Long>> outDegrees()
@@ -392,7 +399,7 @@ getTriplets: DataSet[Triplet[K, VV, EV]]
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Graph Transformations
 -----------------
@@ -510,14 +517,14 @@ val networkWithWeights = network.joinWithEdgesOnSource(vertexOutDegrees, (v1: Do
 * <strong>Difference</strong>: Gelly's `difference()` method performs a difference on the vertex and edge sets of the current graph and the specified graph.
 
 * <strong>Intersect</strong>: Gelly's `intersect()` method performs an intersect on the edge
- sets of the current graph and the specified graph. The result is a new `Graph` that contains all 
+ sets of the current graph and the specified graph. The result is a new `Graph` that contains all
  edges that exist in both input graphs. Two edges are considered equal, if they have the same source
- identifier, target identifier and edge value. Vertices in the resulting graph have no 
- value. If vertex values are required, one can for example retrieve them from one of the input graphs using 
+ identifier, target identifier and edge value. Vertices in the resulting graph have no
+ value. If vertex values are required, one can for example retrieve them from one of the input graphs using
  the `joinWithVertices()` method.
- Depending on the parameter `distinct`, equal edges are either contained once in the resulting 
+ Depending on the parameter `distinct`, equal edges are either contained once in the resulting
  `Graph` or as often as there are pairs of equal edges in the input graphs.
- 
+
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
 {% highlight java %}
@@ -562,7 +569,7 @@ val intersect2 = graph1.intersect(graph2, false)
 </div>
 </div>
 
--[Back to top](#top)
+{% top %}
 
 Graph Mutations
 -----------
@@ -734,7 +741,7 @@ Graph<Long, Long, Double> graph = ...
 DataSet<Tuple2<Vertex<Long, Long>, Vertex<Long, Long>>> vertexPairs = graph.groupReduceOnNeighbors(new SelectLargeWeightNeighbors(), EdgeDirection.OUT);
 
 // user-defined function to select the neighbors which have edges with weight > 0.5
-static final class SelectLargeWeightNeighbors implements NeighborsFunctionWithVertexValue<Long, Long, Double, 
+static final class SelectLargeWeightNeighbors implements NeighborsFunctionWithVertexValue<Long, Long, Double,
 		Tuple2<Vertex<Long, Long>, Vertex<Long, Long>>> {
 
 		@Override
@@ -759,7 +766,7 @@ val graph: Graph[Long, Long, Double] = ...
 val vertexPairs = graph.groupReduceOnNeighbors(new SelectLargeWeightNeighbors, EdgeDirection.OUT)
 
 // user-defined function to select the neighbors which have edges with weight > 0.5
-final class SelectLargeWeightNeighbors extends NeighborsFunctionWithVertexValue[Long, Long, Double, 
+final class SelectLargeWeightNeighbors extends NeighborsFunctionWithVertexValue[Long, Long, Double,
   (Vertex[Long, Long], Vertex[Long, Long])] {
 
 	override def iterateNeighbors(vertex: Vertex[Long, Long],
@@ -779,7 +786,7 @@ final class SelectLargeWeightNeighbors extends NeighborsFunctionWithVertexValue[
 
 When the aggregation computation does not require access to the vertex value (for which the aggregation is performed), it is advised to use the more efficient `EdgesFunction` and `NeighborsFunction` for the user-defined functions. When access to the vertex value is required, one should use `EdgesFunctionWithVertexValue` and `NeighborsFunctionWithVertexValue` instead.
 
-[Back to top](#top)
+{% top %}
 
 Iterative Graph Processing
 -----------
@@ -809,7 +816,7 @@ Let us consider computing Single-Source-Shortest-Paths with vertex-centric itera
 
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
-{% highlight java %} 
+{% highlight java %}
 // read the input graph
 Graph<Long, Double, Double> graph = ...
 
@@ -858,7 +865,7 @@ public static final class VertexDistanceUpdater extends VertexUpdateFunction<Lon
 </div>
 
 <div data-lang="scala" markdown="1">
-{% highlight scala %} 
+{% highlight scala %}
 // read the input graph
 val graph: Graph[Long, Double, Double] = ...
 
@@ -906,23 +913,23 @@ final class VertexDistanceUpdater extends VertexUpdateFunction[Long, Double, Dou
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 ### Configuring a Vertex-Centric Iteration
 A vertex-centric iteration can be configured using a `VertexCentricConfiguration` object.
 Currently, the following parameters can be specified:
 
-* <strong>Name</strong>: The name for the vertex-centric iteration. The name is displayed in logs and messages 
+* <strong>Name</strong>: The name for the vertex-centric iteration. The name is displayed in logs and messages
 and can be specified using the `setName()` method.
 
-* <strong>Parallelism</strong>: The parallelism for the iteration. It can be set using the `setParallelism()` method.	
+* <strong>Parallelism</strong>: The parallelism for the iteration. It can be set using the `setParallelism()` method.
 
 * <strong>Solution set in unmanaged memory</strong>: Defines whether the solution set is kept in managed memory (Flink's internal way of keeping objects in serialized form) or as a simple object map. By default, the solution set runs in managed memory. This property can be set using the `setSolutionSetUnmanagedMemory()` method.
 
 * <strong>Aggregators</strong>: Iteration aggregators can be registered using the `registerAggregator()` method. An iteration aggregator combines
 all aggregates globally once per superstep and makes them available in the next superstep. Registered aggregators can be accessed inside the user-defined `VertexUpdateFunction` and `MessagingFunction`.
 
-* <strong>Broadcast Variables</strong>: DataSets can be added as [Broadcast Variables]({{site.baseurl}}/apis/programming_guide.html#broadcast-variables) to the `VertexUpdateFunction` and `MessagingFunction`, using the `addBroadcastSetForUpdateFunction()` and `addBroadcastSetForMessagingFunction()` methods, respectively.
+* <strong>Broadcast Variables</strong>: DataSets can be added as [Broadcast Variables]({{site.baseurl}}/apis/batch/index.html#broadcast-variables) to the `VertexUpdateFunction` and `MessagingFunction`, using the `addBroadcastSetForUpdateFunction()` and `addBroadcastSetForMessagingFunction()` methods, respectively.
 
 * <strong>Number of Vertices</strong>: Accessing the total number of vertices within the iteration. This property can be set using the `setOptNumVertices()` method.
 The number of vertices can then be accessed in the vertex update function and in the messaging function using the `getNumberOfVertices()` method. If the option is not set in the configuration, this method will return -1.
@@ -952,7 +959,7 @@ parameters.setParallelism(16);
 parameters.registerAggregator("sumAggregator", new LongSumAggregator());
 
 // run the vertex-centric iteration, also passing the configuration parameters
-Graph<Long, Double, Double> result = 
+Graph<Long, Double, Double> result =
 			graph.runVertexCentricIteration(
 			new VertexUpdater(), new Messenger(), maxIterations, parameters);
 
@@ -962,14 +969,14 @@ public static final class VertexUpdater extends VertexUpdateFunction {
 	LongSumAggregator aggregator = new LongSumAggregator();
 
 	public void preSuperstep() {
-	
+
 		// retrieve the Aggregator
 		aggregator = getIterationAggregator("sumAggregator");
 	}
 
 
 	public void updateVertex(Vertex<Long, Long> vertex, MessageIterator inMessages) {
-		
+
 		//do some computation
 		Long partialValue = ...
 
@@ -1011,14 +1018,14 @@ final class VertexUpdater extends VertexUpdateFunction {
 	var aggregator = new LongSumAggregator
 
 	override def preSuperstep {
-	
+
 		// retrieve the Aggregator
 		aggregator = getIterationAggregator("sumAggregator")
 	}
 
 
 	override def updateVertex(vertex: Vertex[Long, Long], inMessages: MessageIterator[Long]) {
-		
+
 		//do some computation
 		val partialValue = ...
 
@@ -1162,7 +1169,7 @@ final class Messenger {...}
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 ### Gather-Sum-Apply Iterations
 Like in the vertex-centric model, Gather-Sum-Apply also proceeds in synchronized iterative steps, called supersteps. Each superstep consists of the following three phases:
@@ -1289,7 +1296,7 @@ Note that `gather` takes a `Neighbor` type as an argument. This is a convenience
 
 For more examples of how to implement algorithms with the Gather-Sum-Apply model, check the {% gh_link /flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSAPageRank.java "GSAPageRank" %} and {% gh_link /flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSAConnectedComponents.java "GSAConnectedComponents" %} library methods of Gelly.
 
-[Back to top](#top)
+{% top %}
 
 ### Configuring a Gather-Sum-Apply Iteration
 A GSA iteration can be configured using a `GSAConfiguration` object.
@@ -1303,7 +1310,7 @@ Currently, the following parameters can be specified:
 
 * <strong>Aggregators</strong>: Iteration aggregators can be registered using the `registerAggregator()` method. An iteration aggregator combines all aggregates globally once per superstep and makes them available in the next superstep. Registered aggregators can be accessed inside the user-defined `GatherFunction`, `SumFunction` and `ApplyFunction`.
 
-* <strong>Broadcast Variables</strong>: DataSets can be added as [Broadcast Variables]({{site.baseurl}}/apis/programming_guide.html#broadcast-variables) to the `GatherFunction`, `SumFunction` and `ApplyFunction`, using the methods `addBroadcastSetForGatherFunction()`, `addBroadcastSetForSumFunction()` and `addBroadcastSetForApplyFunction` methods, respectively.
+* <strong>Broadcast Variables</strong>: DataSets can be added as [Broadcast Variables]({{site.baseurl}}/apis/batch/index.html#broadcast-variables) to the `GatherFunction`, `SumFunction` and `ApplyFunction`, using the `addBroadcastSetForGatherFunction()`, `addBroadcastSetForSumFunction()` and `addBroadcastSetForApplyFunction()` methods, respectively.
 
 * <strong>Number of Vertices</strong>: Accessing the total number of vertices within the iteration. This property can be set using the `setOptNumVertices()` method.
 The number of vertices can then be accessed in the gather, sum and/or apply functions by using the `getNumberOfVertices()` method. If the option is not set in the configuration, this method will return -1.
@@ -1433,7 +1440,7 @@ val result = graph.runGatherSumApplyIteration(new Gather, new Sum, new Apply, ma
 {% endhighlight %}
 </div>
 </div>
-[Back to top](#top)
+{% top %}
 
 ### Vertex-centric and GSA Comparison
 As seen in the examples above, Gather-Sum-Apply iterations are quite similar to vertex-centric iterations. In fact, any algorithm which can be expressed as a GSA iteration can also be written in the vertex-centric model.
@@ -1466,7 +1473,7 @@ List<Edge<Long, Long>> edges = ...
 Graph<Long, Long, Long> graph = Graph.fromCollection(vertices, edges, env);
 
 // will return false: 6 is an invalid ID
-graph.validate(new InvalidVertexIdsValidator<Long, Long, Long>()); 
+graph.validate(new InvalidVertexIdsValidator<Long, Long, Long>());
 
 {% endhighlight %}
 </div>
@@ -1490,7 +1497,7 @@ graph.validate(new InvalidVertexIdsValidator[Long, Long, Long])
 </div>
 </div>
 
-[Back to top](#top)
+{% top %}
 
 Library Methods
 -----------
@@ -1552,7 +1559,7 @@ This library method is an implementation of the community detection algorithm de
 The algorithm is implemented using [vertex-centric iterations](#vertex-centric-iterations).
 Initially, each vertex is assigned a `Tuple2` containing its initial value along with a score equal to 1.0.
 In each iteration, vertices send their labels and scores to their neighbors. Upon receiving messages from its neighbors,
-a vertex chooses the label with the highest score and subsequently re-scores it using the edge values, 
+a vertex chooses the label with the highest score and subsequently re-scores it using the edge values,
 a user-defined hop attenuation parameter, `delta`, and the superstep number.
 The algorithm converges when vertices no longer update their value or when the maximum number of iterations
 is reached.
@@ -1690,25 +1697,25 @@ Each `Tuple3` corresponds to a triangle, with the fields containing the IDs of t
 ### Summarization
 
 #### Overview
-The summarization algorithm computes a condensed version of the input graph by grouping vertices and edges based on 
+The summarization algorithm computes a condensed version of the input graph by grouping vertices and edges based on
 their values. In doing so, the algorithm helps to uncover insights about patterns and distributions in the graph.
 One possible use case is the visualization of communities where the whole graph is too large and needs to be summarized
 based on the community identifier stored at a vertex.
 
 #### Details
-In the resulting graph, each vertex represents a group of vertices that share the same value. An edge, that connects a 
-vertex with itself, represents all edges with the same edge value that connect vertices from the same vertex group. An 
-edge between different vertices in the output graph represents all edges with the same edge value between members of 
+In the resulting graph, each vertex represents a group of vertices that share the same value. An edge, that connects a
+vertex with itself, represents all edges with the same edge value that connect vertices from the same vertex group. An
+edge between different vertices in the output graph represents all edges with the same edge value between members of
 different vertex groups in the input graph.
 
 The algorithm is implemented using Flink data operators. First, vertices are grouped by their value and a representative
-is chosen from each group. For any edge, the source and target vertex identifiers are replaced with the corresponding 
+is chosen from each group. For any edge, the source and target vertex identifiers are replaced with the corresponding
 representative and grouped by source, target and edge value. Output vertices and edges are created from their
 corresponding groupings.
 
 #### Usage
 The algorithm takes a directed, vertex (and possibly edge) attributed graph as input and outputs a new graph where each
-vertex represents a group of vertices and each edge represents a group of edges from the input graph. Furthermore, each 
+vertex represents a group of vertices and each edge represents a group of edges from the input graph. Furthermore, each
 vertex and edge in the output graph stores the common group value and the number of represented elements.
 
-[Back to top](#top)
+{% top %}
diff --git a/docs/libs/index.md b/docs/libs/index.md
index cf5f84621cf4c..b2df0c4ddfa9a 100644
--- a/docs/libs/index.md
+++ b/docs/libs/index.md
@@ -1,5 +1,9 @@
 ---
 title: "Libraries"
+sub-nav-group: batch
+sub-nav-id: libs
+sub-nav-pos: 6
+sub-nav-title: Libraries
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -18,4 +22,8 @@ software distributed under the License is distributed on an
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
--->
\ No newline at end of file
+-->
+
+- Graph processing: [Gelly](gelly_guide.html)
+- Machine Learning: [FlinkML](ml/index.html)
+- Relational Queries: [Table](table.html)
diff --git a/docs/libs/ml/als.md b/docs/libs/ml/als.md
index bd45bb0c44573..cf85399a39205 100644
--- a/docs/libs/ml/als.md
+++ b/docs/libs/ml/als.md
@@ -1,7 +1,11 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Alternating Least Squares
-title: <a href="../ml">FlinkML</a> - Alternating Least Squares
+title: FlinkML - Alternating Least Squares
+
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: ALS
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/contribution_guide.md b/docs/libs/ml/contribution_guide.md
index d40290be3a91d..63769586547c8 100644
--- a/docs/libs/ml/contribution_guide.md
+++ b/docs/libs/ml/contribution_guide.md
@@ -1,7 +1,11 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - How to Contribute 
-title: <a href="../ml">FlinkML</a> - How to Contribute
+title: FlinkML - How to Contribute
+
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: How To Contribute
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/distance_metrics.md b/docs/libs/ml/distance_metrics.md
index e6868c83f1c0b..1a7364a3869be 100644
--- a/docs/libs/ml/distance_metrics.md
+++ b/docs/libs/ml/distance_metrics.md
@@ -1,7 +1,11 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Distance Metrics
-title: <a href="../ml">FlinkML</a> - Distance Metrics
+title: FlinkML - Distance Metrics
+
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Distance Metrics
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/index.md b/docs/libs/ml/index.md
index dc35c3491f67b..973c4f3e0d2a5 100644
--- a/docs/libs/ml/index.md
+++ b/docs/libs/ml/index.md
@@ -1,5 +1,15 @@
 ---
 title: "FlinkML - Machine Learning for Flink"
+# Top navigation
+top-nav-group: libs
+top-nav-pos: 2
+top-nav-title: Machine Learning
+# Sub navigation
+sub-nav-group: batch
+sub-nav-id: flinkml
+sub-nav-pos: 2
+sub-nav-parent: libs
+sub-nav-title: Machine Learning
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -69,7 +79,7 @@ Next, you have to add the FlinkML dependency to the `pom.xml` of your project.
 </dependency>
 {% endhighlight %}
 
-Note that FlinkML is currently not part of the binary distribution. 
+Note that FlinkML is currently not part of the binary distribution.
 See linking with it for cluster execution [here]({{site.baseurl}}/apis/cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
 
 Now you can start solving your analysis task.
diff --git a/docs/libs/ml/min_max_scaler.md b/docs/libs/ml/min_max_scaler.md
index 0c00dcdac77a0..302bf4d91e427 100644
--- a/docs/libs/ml/min_max_scaler.md
+++ b/docs/libs/ml/min_max_scaler.md
@@ -1,7 +1,11 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - MinMax Scaler
-title: <a href="../ml">FlinkML</a> - MinMax Scaler
+title: FlinkML - MinMax Scaler
+
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: MinMax Scaler
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/multiple_linear_regression.md b/docs/libs/ml/multiple_linear_regression.md
index aaf1fbf003a2c..e0085ae1b83ae 100644
--- a/docs/libs/ml/multiple_linear_regression.md
+++ b/docs/libs/ml/multiple_linear_regression.md
@@ -1,7 +1,11 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Multiple linear regression
-title: <a href="../ml">FlinkML</a> - Multiple linear regression
+title: FlinkML - Multiple linear regression
+
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Multiple Linear Regression
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/optimization.md b/docs/libs/ml/optimization.md
index 110383d6802e3..9bcebaaec6ba7 100644
--- a/docs/libs/ml/optimization.md
+++ b/docs/libs/ml/optimization.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Optimization
-title: <a href="../ml">FlinkML</a> - Optimization
+title: FlinkML - Optimization
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Optimization
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/pipelines.md b/docs/libs/ml/pipelines.md
index 04df321571e91..429156d16bdd6 100644
--- a/docs/libs/ml/pipelines.md
+++ b/docs/libs/ml/pipelines.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Looking under the hood of piplines
-title: <a href="../ml">FlinkML</a> - Looking under the hood of pipelines
+title: FlinkML - Looking under the hood of pipelines
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Pipelines
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/polynomial_features.md b/docs/libs/ml/polynomial_features.md
index 7226455525558..27fb1e932af20 100644
--- a/docs/libs/ml/polynomial_features.md
+++ b/docs/libs/ml/polynomial_features.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Polynomial Features
-title: <a href="../ml">FlinkML</a> - Polynomial Features
+title: FlinkML - Polynomial Features
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Polynomial Features
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/quickstart.md b/docs/libs/ml/quickstart.md
index f5d745192ad1e..7b3c7109cd901 100644
--- a/docs/libs/ml/quickstart.md
+++ b/docs/libs/ml/quickstart.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Quickstart Guide
-title: <a href="../ml">FlinkML</a> - Quickstart Guide
+title: FlinkML - Quickstart Guide
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Quickstart Guide
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/standard_scaler.md b/docs/libs/ml/standard_scaler.md
index dea7e1d1c7149..f6d7b62b56e08 100644
--- a/docs/libs/ml/standard_scaler.md
+++ b/docs/libs/ml/standard_scaler.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - Standard Scaler
-title: <a href="../ml">FlinkML</a> - Standard Scaler
+title: FlinkML - Standard Scaler
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: Standard Scaler
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/ml/svm.md b/docs/libs/ml/svm.md
index c34497981ba44..b149d31b7c143 100644
--- a/docs/libs/ml/svm.md
+++ b/docs/libs/ml/svm.md
@@ -1,7 +1,10 @@
 ---
 mathjax: include
-htmlTitle: FlinkML - SVM using CoCoA
-title: <a href="../ml">FlinkML</a> - SVM using CoCoA
+title: FlinkML - SVM using CoCoA
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: flinkml
+sub-nav-title: SVM (CoCoA)
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/libs/table.md b/docs/libs/table.md
index 1aedce3d2a369..74ffd3cd94d87 100644
--- a/docs/libs/table.md
+++ b/docs/libs/table.md
@@ -1,6 +1,15 @@
 ---
 title: "Table API - Relational Queries"
 is_beta: true
+# Top navigation
+top-nav-group: libs
+top-nav-pos: 3
+top-nav-title: "Relational: Table"
+# Sub navigation
+sub-nav-group: batch
+sub-nav-parent: libs
+sub-nav-pos: 3
+sub-nav-title: Table
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -60,7 +69,7 @@ val result = expr.groupBy('word).select('word, 'count.sum as 'count).toDataSet[W
 The expression DSL uses Scala symbols to refer to field names and we use code generation to
 transform expressions to efficient runtime code. Please note that the conversion to and from
 Tables only works when using Scala case classes or Flink POJOs. Please check out
-the [programming guide]({{ site.baseurl }}/apis/programming_guide.html) to learn the requirements for a class to be
+the [programming guide]({{ site.baseurl }}/apis/batch/index.html) to learn the requirements for a class to be
 considered a POJO.
 
 This is another example that shows how you
@@ -387,4 +396,3 @@ Here, `literal` is a valid Java literal and `field reference` specifies a column
 column names follow Java identifier syntax.
 
 Only the types `LONG` and `STRING` can be casted to `DATE` and vice versa. A `LONG` casted to `DATE` must be a milliseconds timestamp. A `STRING` casted to `DATE` must have the format "`yyyy-MM-dd HH:mm:ss.SSS`", "`yyyy-MM-dd`", "`HH:mm:ss`", or a milliseconds timestamp. By default, all timestamps refer to the UTC timezone beginning from January 1, 1970, 00:00:00 in milliseconds.
-
diff --git a/docs/page/css/flink.css b/docs/page/css/flink.css
index 3b09e54db0d8b..ed752c2c295b3 100644
--- a/docs/page/css/flink.css
+++ b/docs/page/css/flink.css
@@ -23,6 +23,7 @@ under the License.
 /* Padding at top because of the fixed navbar. */
 body {
 	padding-top: 70px;
+	padding-bottom: 50px;
 }
 
 /* Our logo. */
@@ -38,7 +39,10 @@ body {
 	color: black;
 	font-weight: bold;
 }
-.navbar-default .navbar-nav > li > a:hover {
+.navbar-default .navbar-nav > li > a:hover,
+.dropdown-menu > .active > a,
+.dropdown-menu > .active > a:hover {
+	color: #000000;
 	background: #E7E7E7;
 }
 
@@ -52,12 +56,14 @@ body {
 }
 
 /*=============================================================================
-                        Navbar at the side of the page
+ Per page TOC
 =============================================================================*/
 
 /* Move the side nav a little bit down to align with the main heading */
 #markdown-toc {
 	font-size: 90%;
+	padding-top: 16px;
+	padding-bottom: 16px;
 }
 
 /* Custom list styling */
@@ -71,17 +77,24 @@ body {
 
 /* All element */
 #markdown-toc li > a {
+	color: #000;
 	display: block;
 	padding: 5px 10px;
 	border: 1px solid #E5E5E5;
-	margin:-1px;
+	margin: -1px;
 }
 #markdown-toc li > a:hover,
 #markdown-toc li > a:focus {
-  text-decoration: none;
+  text-decoration: underline;
   background-color: #eee;
 }
 
+@media (min-width: 768px) {
+	#markdown-toc > li {
+		width: 400px;
+	}
+}
+
 /* 1st-level elements */
 #markdown-toc > li > a {
 	font-weight: bold;
@@ -101,31 +114,114 @@ body {
 	border-bottom: 1px solid #E5E5E5;
 }
 
+/*=============================================================================
+ Sub navigation (left side)
+=============================================================================*/
+
+/* Custom list styling */
+#sub-nav, #sub-nav ul {
+	list-style: none;
+	display: block;
+	position: relative;
+	padding-left: 0;
+	margin-bottom: 0;
+}
+
+/* All elements */
+#sub-nav li > a {
+	display: block;
+	padding: 5px 10px;
+	border: 1px solid #e5e5e5;
+	margin: -1px;
+	color: #000000;
+}
+
+#sub-nav li > a:hover,
+#sub-nav li > a:focus {
+	text-decoration: underline;
+	background-color: #e7e7e7;
+}
+
+/* 1st-level elements */
+#sub-nav > li > a {
+	background: #f8f8f8;
+	color: black;
+	font-weight: bold;
+}
+
+/* 2nd-level element */
+#sub-nav > li li > a {
+	padding-left: 20px; /* A little more indentation*/
+	background: #fff;
+}
+
+/* >= 3rd-level element */
+#sub-nav > li li li {
+	display: none; /* hide */
+}
+
+#sub-nav a.active {
+	background: #e7e7e7;
+}
+
+#sub-nav a.active:before {
+	content: '» ';
+}
+
 /*=============================================================================
                                     Text
 =============================================================================*/
 
-h2, h3 {
+h2 {
+	border-bottom: 1px solid #e5e5e5;
+}
+
+h2, h3, h4, h5, h6 {
 	padding-top: 1em;
-	padding-bottom: 5px;
-	border-bottom: 1px solid #E5E5E5;
 }
 
 code {
-	background: #f5f5f5;
-	padding: 0;
-	color: #333333;
+	color: #000000;
+	background: #ffffff;
+	padding: 1px;
 	font-family: "Menlo", "Lucida Console", monospace;
 }
 
 pre {
-	font-size: 85%;
+	background: #f7f7f7;
+	border: none;
+	font-size: 14px;
+	font-family: "Menlo", "Lucida Console", monospace;
 }
 
 img.center {
 	display: block;
 	margin-left: auto;
-    margin-right: auto;
+	margin-right: auto;
+}
+
+a.top {
+	color: black;
+	text-decoration: none;
+}
+
+.text {
+	font-size: 16px;
 }
 
+.beta {
+	font-weight: normal;
+	color: #333;
+}
+
+table {
+	width: 100%;
+}
 
+th {
+	text-align: center;
+}
+
+td {
+	padding: 5px;
+}
diff --git a/docs/quickstart/java_api_quickstart.md b/docs/quickstart/java_api_quickstart.md
index 4d9439610df03..ab1614dafd07c 100644
--- a/docs/quickstart/java_api_quickstart.md
+++ b/docs/quickstart/java_api_quickstart.md
@@ -1,5 +1,9 @@
 ---
 title: "Quickstart: Java API"
+# Top navigation
+top-nav-group: quickstart
+top-nav-pos: 3
+top-nav-title: Java API
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/quickstart/run_example_quickstart.md b/docs/quickstart/run_example_quickstart.md
index 449381210fbf9..5a02fed2c1742 100644
--- a/docs/quickstart/run_example_quickstart.md
+++ b/docs/quickstart/run_example_quickstart.md
@@ -1,5 +1,9 @@
 ---
 title: "Quick Start: Run K-Means Example"
+# Top navigation
+top-nav-group: quickstart
+top-nav-pos: 2
+top-nav-title: Run Example
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/quickstart/scala_api_quickstart.md b/docs/quickstart/scala_api_quickstart.md
index 6155e4124e636..28006c6d7d6c9 100644
--- a/docs/quickstart/scala_api_quickstart.md
+++ b/docs/quickstart/scala_api_quickstart.md
@@ -1,5 +1,9 @@
 ---
 title: "Quickstart: Scala API"
+# Top navigation
+top-nav-group: quickstart
+top-nav-pos: 4
+top-nav-title: Scala API
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/quickstart/setup_quickstart.md b/docs/quickstart/setup_quickstart.md
index 6fcb7296e493f..a9bc218929dfd 100644
--- a/docs/quickstart/setup_quickstart.md
+++ b/docs/quickstart/setup_quickstart.md
@@ -1,5 +1,9 @@
 ---
 title: "Quickstart: Setup"
+# Top navigation
+top-nav-group: quickstart
+top-nav-pos: 1
+top-nav-title: Setup
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/setup/building.md b/docs/setup/building.md
index 01600f95c92d0..8c1549e00ce50 100644
--- a/docs/setup/building.md
+++ b/docs/setup/building.md
@@ -1,5 +1,8 @@
 ---
-title:  "Build Flink"
+title: Building Flink
+top-nav-group: setup
+top-nav-pos: 1
+top-nav-title: Build Flink
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,8 +23,16 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-In order to build Flink, you need the source code. Either download the source of a release or clone the git repository. In addition to that, you need Maven 3 and a JDK (Java Development Kit).
-Flink requires at least Java 7 to build. We recommend using Java 8.
+This page covers how to build Flink {{ site.version }} from sources.
+
+* This will be replaced by the TOC
+{:toc}
+
+## Build Flink
+
+In order to build Flink you need the source code. Either [download the source of a release]({{ site.download_url }}) or [clone the git repository]({{ site.github_url }}).
+
+In addition you need **Maven 3** and a **JDK** (Java Development Kit). Flink requires **at least Java 7** to build. We recommend using Java 8.
 
 To clone from git, enter:
 
@@ -32,87 +43,85 @@ git clone {{ site.github_url }}
 The simplest way of building Flink is by running:
 
 ~~~bash
-cd flink
 mvn clean install -DskipTests
 ~~~
 
-This instructs Maven (`mvn`) to first remove all existing builds (`clean`) and then create a new Flink binary (`install`). The `-DskipTests` command prevents Maven from executing the unit tests. 
+This instructs [Maven](http://maven.apache.org) (`mvn`) to first remove all existing builds (`clean`) and then create a new Flink binary (`install`). The `-DskipTests` option prevents Maven from executing the tests.
 
-[Read more](http://maven.apache.org/) about Apache Maven.
+The default build includes the YARN Client for Hadoop 2.
 
+{% top %}
 
+## Hadoop Versions
 
-## Build Flink for a specific Hadoop Version
+{% info %} Most users do not need to do this manually. The [download page]({{ site.download_url }}) contains binary packages for common Hadoop versions.
 
-This section covers building Flink for a specific Hadoop version. Most users do not need to do this manually. The download page of Flink contains binary packages for common setups.
+Flink depends on HDFS and YARN, both of which are part of [Apache Hadoop](http://hadoop.apache.org). There exist many different versions of Hadoop (from both the upstream project and the different Hadoop distributions). If you use a wrong combination of versions, exceptions can occur.
 
-The problem is that Flink uses HDFS and YARN which are both dependencies from Apache Hadoop. There exist many different versions of Hadoop (from both the upstream project and the different Hadoop distributions). If a user is using a wrong combination of versions, exceptions like this one occur:
+There are two main versions of Hadoop that we need to differentiate:
+- **Hadoop 1**, with all versions starting with zero or one, like *0.20*, *0.23* or *1.2.1*.
+- **Hadoop 2**, with all versions starting with 2, like *2.6.0*.
 
-~~~bash
-ERROR: Job execution failed.
-    org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'TextInputFormat(/my/path)':
-    java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException:
-    Protocol message contained an invalid tag (zero).; Host Details :
-~~~
+The main difference between Hadoop 1 and Hadoop 2 is the availability of [Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html), Hadoop's cluster resource manager.
 
-There are two main versions of Hadoop that we need to differentiate:
-- Hadoop 1, with all versions starting with zero or one, like 0.20, 0.23 or 1.2.1.
-- Hadoop 2, with all versions starting with 2, like 2.2.0.
-The main differentiation between Hadoop 1 and Hadoop 2 is the availability of Hadoop YARN (Hadoops cluster resource manager).
+**By default, Flink is using the Hadoop 2 dependencies**.
 
-By default, Flink is using the Hadoop 2 dependencies.
+### Hadoop 1
 
-**To build Flink for Hadoop 1**, issue the following command:
+To build Flink for Hadoop 1, issue the following command:
 
 ~~~bash
 mvn clean install -DskipTests -Dhadoop.profile=1
 ~~~
 
-The `-Dhadoop.profile=1` flag instructs Maven to build Flink for Hadoop 1. Note that the features included in Flink change when using a different Hadoop profile. In particular the support for YARN and the build-in HBase support are not available in Hadoop 1 builds.
+The `-Dhadoop.profile=1` flag instructs Maven to build Flink for Hadoop 1. Note that the features included in Flink change when using a different Hadoop profile. In particular, there is no support for YARN and HBase in Hadoop 1 builds.
 
+### Hadoop 2
 
-You can also **specify a specific Hadoop version to build against**:
+You can also specify a specific Hadoop version to build against:
 
 ~~~bash
 mvn clean install -DskipTests -Dhadoop.version=2.4.1
 ~~~
 
+#### Before Hadoop 2.2.0
 
-**To build Flink against a vendor specific Hadoop version**, issue the following command:
+Maven will automatically build Flink with its YARN client. Hadoop versions before the 2.2.0 release are *not* supported by Flink's YARN client. Therefore, you need to exclude the YARN client with the following string: `-P!include-yarn`.
+
+So if you are building Flink for Hadoop `2.0.0-alpha`, use the following command:
 
 ~~~bash
-mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.2.0-cdh5.0.0-beta-2
+mvn clean install -P!include-yarn -Dhadoop.version=2.0.0-alpha
 ~~~
 
-The `-Pvendor-repos` activates a Maven [build profile](http://maven.apache.org/guides/introduction/introduction-to-profiles.html) that includes the repositories of popular Hadoop vendors such as Cloudera, Hortonworks, or MapR.
-
-**Build Flink for `hadoop2` versions before 2.2.0**
+### Vendor-specific Versions
 
-Maven will automatically build Flink with its YARN client. But there were some changes in Hadoop versions before the 2.2.0 Hadoop release that are not supported by Flink's YARN client. Therefore, you can disable building the YARN client with the following string: `-P!include-yarn`. 
-
-So if you are building Flink for Hadoop `2.0.0-alpha`, use the following command:
+To build Flink against a vendor-specific Hadoop version, issue the following command:
 
 ~~~bash
--P!include-yarn -Dhadoop.version=2.0.0-alpha
+mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.2.0-cdh5.0.0-beta-2
 ~~~
 
-## Build Flink for a specific Scala Version
+The `-Pvendor-repos` activates a Maven [build profile](http://maven.apache.org/guides/introduction/introduction-to-profiles.html) that includes the repositories of popular Hadoop vendors such as Cloudera, Hortonworks, or MapR.
+
+{% top %}
 
-**Note:** Users that purely use the Java APIs and libraries can ignore this section.
+## Scala Versions
 
-Flink has APIs, libraries, and runtime modules written in [Scala](http://scala-lang.org). Users of the Scala API and libraries may have to match the Scala version of Flink with the Scala version
-of their projects (because Scala is not strictly backwards compatible).
+{% info %} Users that purely use the Java APIs and libraries can *ignore* this section.
 
-By default, Flink is built with the Scala *2.10*. To build Flink with Scala *2.11*, you need to change the default Scala *binary version* with a build script:
+Flink has APIs, libraries, and runtime modules written in [Scala](http://scala-lang.org). Users of the Scala API and libraries may have to match the Scala version of Flink with the Scala version of their projects (because Scala is not strictly backwards compatible).
+
+**By default, Flink is built with Scala 2.10**. To build Flink with Scala *2.11*, you can change the default Scala *binary version* with the following script:
 
 ~~~bash
 # Switch Scala binary version between 2.10 and 2.11
 tools/change-scala-version.sh 2.11
-# Build and install locally
+# Build with Scala version 2.11
 mvn clean install -DskipTests
 ~~~
 
-To build against custom Scala versions, you need to switch to the appropriate binary version and supply the *language version* as additional build property. For example, to buid against Scala 2.11.4, you have to execute:
+To build against custom Scala versions, you need to switch to the appropriate binary version and supply the *language version* as an additional build property. For example, to build against Scala 2.11.4, you have to execute:
 
 ~~~bash
 # Switch Scala binary version to 2.11
@@ -125,12 +134,11 @@ Flink is developed against Scala *2.10* and tested additionally against Scala *2
 
 Newer versions may be compatible, depending on breaking changes in the language features used by Flink, and the availability of Flink's dependencies in those Scala versions. The dependencies written in Scala include for example *Kafka*, *Akka*, *Scalatest*, and *scopt*.
 
+{% top %}
 
-## Building in encrypted filesystems
+## Encrypted File Systems
 
-If your home directory is encrypted you might encounter a `java.io.IOException: File 
-name too long` exception. Some encrypted file systems, like encfs used by Ubuntu, do not allow
-long filenames, which is the cause of this error.
+If your home directory is encrypted you might encounter a `java.io.IOException: File name too long` exception. Some encrypted file systems, like encfs used by Ubuntu, do not allow long filenames, which is the cause of this error.
 
 The workaround is to add:
 
@@ -141,18 +149,16 @@ The workaround is to add:
 </args>
 ~~~
 
-in the compiler configuration of the `pom.xml` file of the module causing the error. For example,
-if the error appears in the `flink-yarn` module, the above code should 
-be added under the `<configuration>` tag of `scala-maven-plugin`. See 
-[this issue](https://issues.apache.org/jira/browse/FLINK-2003) for more information.
+in the compiler configuration of the `pom.xml` file of the module causing the error. For example, if the error appears in the `flink-yarn` module, the above code should be added under the `<configuration>` tag of `scala-maven-plugin`. See [this issue](https://issues.apache.org/jira/browse/FLINK-2003) for more information.
+
+{% top %}
 
-## Background
+## Internals
 
-The builds with Maven are controlled by [properties](http://maven.apache.org/pom.html#Properties) and <a href="http://maven.apache.org/guides/introduction/introduction-to-profiles.html">build profiles</a>.
-There are two profiles, one for hadoop1 and one for hadoop2. When the hadoop2 profile is enabled (default), the system will also build the YARN client.
+The builds with Maven are controlled by [properties](http://maven.apache.org/pom.html#Properties) and [build profiles](http://maven.apache.org/guides/introduction/introduction-to-profiles.html). There are two profiles, one for `hadoop1` and one for `hadoop2`. When the `hadoop2` profile is enabled (default), the system will also build the YARN client.
 
-To enable the hadoop1 profile, set `-Dhadoop.profile=1` when building.
-Depending on the profile, there are two Hadoop versions, set via properties. For "hadoop1", we use 1.2.1 by default, for "hadoop2" it is 2.2.0.
+To enable the `hadoop1` profile, set `-Dhadoop.profile=1` when building. Depending on the profile, there are two Hadoop versions, set via properties. For `hadoop1`, we use 1.2.1 by default, for `hadoop2` it is 2.3.0.
 
 You can change these versions with the `hadoop-two.version` (or `hadoop-one.version`) property. For example `-Dhadoop-two.version=2.4.0`.
 
+{% top %}
diff --git a/docs/setup/cluster_setup.md b/docs/setup/cluster_setup.md
index 3ee6630f4dd7b..8ff96d7326402 100644
--- a/docs/setup/cluster_setup.md
+++ b/docs/setup/cluster_setup.md
@@ -1,5 +1,8 @@
 ---
 title:  "Cluster Setup"
+top-nav-group: deployment
+top-nav-title: Cluster (Standalone)
+top-nav-pos: 2
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,177 +23,53 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-This documentation is intended to provide instructions on how to run
-Flink in a fully distributed fashion on a static (but possibly
-heterogeneous) cluster.
-
-This involves two steps. First, installing and configuring Flink and
-second installing and configuring the [Hadoop Distributed
-Filesystem](http://hadoop.apache.org/) (HDFS).
+This page provides instructions on how to run Flink in a *fully distributed fashion* on a *static* (but possibly heterogeneous) cluster.
 
 * This will be replaced by the TOC
 {:toc}
 
-## Preparing the Cluster
+## Requirements
 
 ### Software Requirements
 
-Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**,
-and **Cygwin** (for Windows) and expects the cluster to consist of **one master
-node** and **one or more worker nodes**. Before you start to setup the system,
-make sure you have the following software installed **on each node**:
+Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and **Cygwin** (for Windows) and expects the cluster to consist of **one master node** and **one or more worker nodes**. Before you start to set up the system, make sure you have the following software installed **on each node**:
 
 - **Java 1.7.x** or higher,
 - **ssh** (sshd must be running to use the Flink scripts that manage
   remote components)
 
-If your cluster does not fulfill these software requirements you will need to
-install/upgrade it.
-
-For example, on Ubuntu Linux, type in the following commands to install Java and
-ssh:
-
-~~~bash
-sudo apt-get install ssh
-sudo apt-get install openjdk-7-jre
-~~~
-
-You can check the correct installation of Java by issuing the following command:
-
-~~~bash
-java -version
-~~~
-
-The command should output something comparable to the following on every node of
-your cluster (depending on your Java version, there may be small differences):
-
-~~~bash
-java version "1.7.0_55"
-Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
-Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
-~~~
-
-To make sure the ssh daemon is running properly, you can use the command
-
-~~~bash
-ps aux | grep sshd
-~~~
-
-Something comparable to the following line should appear in the output
-of the command on every host of your cluster:
-
-~~~bash
-root       894  0.0  0.0  49260   320 ?        Ss   Jan09   0:13 /usr/sbin/sshd
-~~~
-
-### Configuring Remote Access with ssh
-
-In order to start/stop the remote processes, the master node requires access via
-ssh to the worker nodes. It is most convenient to use ssh's public key
-authentication for this. To setup public key authentication, log on to the
-master as the user who will later execute all the Flink components. **The
-same user (i.e. a user with the same user name) must also exist on all worker
-nodes**. For the remainder of this instruction we will refer to this user as
-*flink*. Using the super user *root* is highly discouraged for security
-reasons.
-
-Once you logged in to the master node as the desired user, you must generate a
-new public/private key pair. The following command will create a new
-public/private key pair into the *.ssh* directory inside the home directory of
-the user *flink*. See the ssh-keygen man page for more details. Note that
-the private key is not protected by a passphrase.
-
-~~~bash
-ssh-keygen -b 2048 -P '' -f ~/.ssh/id_rsa
-~~~
-
-Next, copy/append the content of the file *.ssh/id_rsa.pub* to your
-authorized_keys file. The content of the authorized_keys file defines which
-public keys are considered trustworthy during the public key authentication
-process. On most systems the appropriate command is
-
-~~~bash
-cat .ssh/id_rsa.pub >> .ssh/authorized_keys
-~~~
-
-On some Linux systems, the authorized keys file may also be expected by the ssh
-daemon under *.ssh/authorized_keys2*. In either case, you should make sure the
-file only contains those public keys which you consider trustworthy for each
-node of cluster.
+If your cluster does not fulfill these software requirements you will need to install/upgrade it.
 
-Finally, the authorized keys file must be copied to every worker node of your
-cluster. You can do this by repeatedly typing in
-
-~~~bash
-scp .ssh/authorized_keys <worker>:~/.ssh/
-~~~
-
-and replacing *\<worker\>* with the host name of the respective worker node.
-After having finished the copy process, you should be able to log on to each
-worker node from your master node via ssh without a password.
-
-### Setting JAVA_HOME on each Node
-
-Flink requires the `JAVA_HOME` environment variable to be set on the
-master and all worker nodes and point to the directory of your Java
-installation.
-
-You can set this variable in `conf/flink-conf.yaml` via the
-`env.java.home` key.
-
-Alternatively, add the following line to your shell profile. If you use the
-*bash* shell (probably the most common shell), the shell profile is located in
-*\~/.bashrc*:
-
-~~~bash
-export JAVA_HOME=/path/to/java_home/
-~~~
+{% top %}
 
-If your ssh daemon supports user environments, you can also add `JAVA_HOME` to
-*.\~/.ssh/environment*. As super user *root* you can enable ssh user
-environments with the following commands:
+### `JAVA_HOME` Configuration
 
-~~~bash
-echo "PermitUserEnvironment yes" >> /etc/ssh/sshd_config
-/etc/init.d/ssh restart
+Flink requires the `JAVA_HOME` environment variable to be set on the master and all worker nodes and point to the directory of your Java installation.
 
-# on some system you might need to replace the above line with
-/etc/init.d/sshd restart
-~~~
+You can set this variable in `conf/flink-conf.yaml` via the `env.java.home` key.
 
+{% top %}
 
 ## Flink Setup
 
-Go to the [downloads page]({{site.baseurl}}/downloads.html) and get the ready to run
-package. Make sure to pick the Flink package **matching your Hadoop
-version**.
+Go to the [downloads page]({{ site.download_url }}) and get the ready to run package. Make sure to pick the Flink package **matching your Hadoop version**. If you don't plan to use Hadoop, pick any version.
 
-After downloading the latest release, copy the archive to your master node and
-extract it:
+After downloading the latest release, copy the archive to your master node and extract it:
 
 ~~~bash
 tar xzf flink-*.tgz
 cd flink-*
 ~~~
 
-### Configuring the Cluster
+### Configuring Flink
 
-After having extracted the system files, you need to configure Flink for
-the cluster by editing *conf/flink-conf.yaml*.
+After having extracted the system files, you need to configure Flink for the cluster by editing *conf/flink-conf.yaml*.
 
-Set the `jobmanager.rpc.address` key to point to your master node. Furthermode
-define the maximum amount of main memory the JVM is allowed to allocate on each
-node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys.
+Set the `jobmanager.rpc.address` key to point to your master node. Furthermore, define the maximum amount of main memory the JVM is allowed to allocate on each node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys.
 
-The value is given in MB. If some worker nodes have more main memory which you
-want to allocate to the Flink system you can overwrite the default value
-by setting an environment variable `FLINK_TM_HEAP` on the respective
-node.
+The value is given in MB. If some worker nodes have more main memory that you want to allocate to the Flink system, you can override the default value by setting the environment variable `FLINK_TM_HEAP` on the respective node.
 
-Finally you must provide a list of all nodes in your cluster which shall be used
-as worker nodes. Therefore, similar to the HDFS configuration, edit the file
-*conf/slaves* and enter the IP/host name of each worker node. Each worker node
-will later run a TaskManager.
+Finally, you must provide a list of all nodes in your cluster which shall be used as worker nodes. To do so, edit the file *conf/slaves* and enter the IP/host name of each worker node. Each worker node will later run a TaskManager.
 
 Each entry must be separated by a new line, as in the following example:
 
@@ -203,12 +82,9 @@ Each entry must be separated by a new line, as in the following example:
 192.168.0.150
 ~~~
 
-The Flink directory must be available on every worker under the same
-path. Similarly as for HDFS, you can use a shared NSF directory, or copy the
-entire Flink directory to every worker node.
+The Flink directory must be available on every worker under the same path. You can use a shared NFS directory, or copy the entire Flink directory to every worker node.
 
-Please see the [configuration page](config.html) for details and additional
-configuration options.
+Please see the [configuration page](config.html) for details and additional configuration options.
 
 In particular,
 
@@ -219,14 +95,11 @@ In particular,
 
 are very important configuration values.
 
+{% top %}
 
 ### Starting Flink
 
-The following script starts a JobManager on the local node and connects via
-SSH to all worker nodes listed in the *slaves* file to start the
-TaskManager on each node. Now your Flink system is up and
-running. The JobManager running on the local node will now accept jobs
-at the configured RPC port.
+The following script starts a JobManager on the local node and connects via SSH to all worker nodes listed in the *slaves* file to start the TaskManager on each node. Now your Flink system is up and running. The JobManager running on the local node will now accept jobs at the configured RPC port.
 
 Assuming that you are on the master node and inside the Flink directory:
 
@@ -236,136 +109,24 @@ bin/start-cluster.sh
 
 To stop Flink, there is also a `stop-cluster.sh` script.
 
-### Optional: Adding JobManager/TaskManager instances to a cluster
+{% top %}
 
-You can add both TaskManager or JobManager instances to your running cluster with the `bin/taskmanager.sh` and `bin/jobmanager.sh` scripts.
+### Adding JobManager/TaskManager Instances to a Cluster
 
-#### Adding a TaskManager
-<pre>
-bin/taskmanager.sh start|stop|stop-all
-</pre>
+You can add both JobManager and TaskManager instances to your running cluster with the `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts.
 
 #### Adding a JobManager
-<pre>
-bin/jobmanager.sh (start (local|cluster))|stop|stop-all
-</pre>
-
-Make sure to call these scripts on the hosts, on which you want to start/stop the respective instance.
-
-
-## Optional: Hadoop Distributed Filesystem (HDFS) Setup
-
-**NOTE** Flink does not require HDFS to run; HDFS is simply a typical choice of a distributed data
-store to read data from (in parallel) and write results to.
-If HDFS is already available on the cluster, or Flink is used purely with different storage
-techniques (e.g., Apache Kafka, JDBC, Rabbit MQ, or other storage or message queues), this
-setup step is not needed.
-
-
-The following instructions are a general overview of usual required settings. Please consult one of the
-many installation guides available online for more detailed instructions.
-
-__Note that the following instructions are based on Hadoop 1.2 and might differ
-for Hadoop 2.__
-
-### Downloading, Installing, and Configuring HDFS
-
-Similar to the Flink system HDFS runs in a distributed fashion. HDFS
-consists of a **NameNode** which manages the distributed file system's meta
-data. The actual data is stored by one or more **DataNodes**. For the remainder
-of this instruction we assume the HDFS's NameNode component runs on the master
-node while all the worker nodes run an HDFS DataNode.
-
-To start, log on to your master node and download Hadoop (which includes  HDFS)
-from the Apache [Hadoop Releases](http://hadoop.apache.org/releases.html) page.
-
-Next, extract the Hadoop archive.
-
-After having extracted the Hadoop archive, change into the Hadoop directory and
-edit the Hadoop environment configuration file:
-
-~~~bash
-cd hadoop-*
-vi conf/hadoop-env.sh
-~~~
-
-Uncomment and modify the following line in the file according to the path of
-your Java installation.
-
-~~~
-export JAVA_HOME=/path/to/java_home/
-~~~
-
-Save the changes and open the HDFS configuration file *conf/hdfs-site.xml*. HDFS
-offers multiple configuration parameters which affect the behavior of the
-distributed file system in various ways. The following excerpt shows a minimal
-configuration which is required to make HDFS work. More information on how to
-configure HDFS can be found in the [HDFS User
-Guide](http://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) guide.
-
-~~~xml
-<configuration>
-  <property>
-    <name>fs.default.name</name>
-    <value>hdfs://MASTER:50040/</value>
-  </property>
-  <property>
-    <name>dfs.data.dir</name>
-    <value>DATAPATH</value>
-  </property>
-</configuration>
-~~~
-
-Replace *MASTER* with the IP/host name of your master node which runs the
-*NameNode*. *DATAPATH* must be replaced with path to the directory in which the
-actual HDFS data shall be stored on each worker node. Make sure that the
-*flink* user has sufficient permissions to read and write in that
-directory.
-
-After having saved the HDFS configuration file, open the file *conf/slaves* and
-enter the IP/host name of those worker nodes which shall act as *DataNode*s.
-Each entry must be separated by a line break.
-
-~~~
-<worker 1>
-<worker 2>
-.
-.
-.
-<worker n>
-~~~
-
-Initialize the HDFS by typing in the following command. Note that the
-command will **delete all data** which has been previously stored in the
-HDFS. However, since we have just installed a fresh HDFS, it should be
-safe to answer the confirmation with *yes*.
 
 ~~~bash
-bin/hadoop namenode -format
+bin/jobmanager.sh (start cluster)|stop|stop-all
 ~~~
 
-Finally, we need to make sure that the Hadoop directory is available to
-all worker nodes which are intended to act as DataNodes and that all nodes
-**find the directory under the same path**. We recommend to use a shared network
-directory (e.g. an NFS share) for that. Alternatively, one can copy the
-directory to all nodes (with the disadvantage that all configuration and
-code updates need to be synced to all nodes).
-
-### Starting HDFS
-
-To start the HDFS log on to the master and type in the following
-commands
+#### Adding a TaskManager
 
 ~~~bash
-cd hadoop-*
-bin/start-dfs.sh
+bin/taskmanager.sh start|stop|stop-all
 ~~~
 
-If your HDFS setup is correct, you should be able to open the HDFS
-status website at *http://MASTER:50070*. In a matter of a seconds,
-all DataNodes should appear as live nodes. For troubleshooting we would
-like to point you to the [Hadoop Quick
-Start](http://wiki.apache.org/hadoop/QuickStart)
-guide.
-
+Make sure to call these scripts on the hosts on which you want to start/stop the respective instance.
 
+{% top %}
diff --git a/docs/setup/config.md b/docs/setup/config.md
index a8aba4912c2b7..3f476d8ec22c1 100644
--- a/docs/setup/config.md
+++ b/docs/setup/config.md
@@ -1,5 +1,7 @@
 ---
 title:  "Configuration"
+top-nav-group: setup
+top-nav-pos: 2
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,295 +22,122 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## Overview
+The default configuration parameters allow Flink to run out-of-the-box in single node setups.
 
-The default configuration parameters allow Flink to run out-of-the-box
-in single node setups.
+This page lists the most common options that are typically needed to set up a well-performing (distributed) installation. In addition, it provides a full list of all available configuration parameters.
 
-This page lists the most common options that are typically needed to set
-up a well performing (distributed) installation. In addition a full
-list of all available configuration parameters is listed here.
+All configuration is done in `conf/flink-conf.yaml`, which is expected to be a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html) with format `key: value`.
 
-All configuration is done in `conf/flink-conf.yaml`, which is expected to be
-a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html)
-with format `key: value`.
-
-The system and run scripts parse the config at startup time. Changes to the configuration
-file require restarting the Flink JobManager and TaskManagers.
-
-The configuration files for the TaskManagers can be different, Flink does not assume
-uniform machines in the cluster.
+The system and run scripts parse the config at startup time. Changes to the configuration file require restarting the Flink JobManager and TaskManagers.
 
+The configuration files for the TaskManagers can differ from each other, because Flink does not assume uniform machines in the cluster.
 
 * This will be replaced by the TOC
 {:toc}
 
-
 ## Common Options
 
-- `env.java.home`: The path to the Java installation to use (DEFAULT: system's
-default Java installation, if found). Needs to be specified if the startup
-scipts fail to automatically resolve the java home directory. Can be specified
-to point to a specific java installation or version. If this option is not
-specified, the startup scripts also evaluate the `$JAVA_HOME` environment variable.
+- `env.java.home`: The path to the Java installation to use (DEFAULT: system's default Java installation, if found). Needs to be specified if the startup scripts fail to automatically resolve the java home directory. Can be specified to point to a specific java installation or version. If this option is not specified, the startup scripts also evaluate the `$JAVA_HOME` environment variable.
 
-- `jobmanager.rpc.address`: The IP address of the JobManager, which is the
-master/coordinator of the distributed system (DEFAULT: localhost).
+- `jobmanager.rpc.address`: The IP address of the JobManager, which is the master/coordinator of the distributed system (DEFAULT: localhost).
 
 - `jobmanager.rpc.port`: The port number of the JobManager (DEFAULT: 6123).
 
-- `jobmanager.heap.mb`: JVM heap size (in megabytes) for the JobManager. You may have to increase the heap size for the JobManager if you are running
-very large applications (with many operators), or if you are keeping a long history of them.
-
-- `taskmanager.heap.mb`: JVM heap size (in megabytes) for the TaskManagers,
-which are the parallel workers of the system. In
-contrast to Hadoop, Flink runs operators (e.g., join, aggregate) and
-user-defined functions (e.g., Map, Reduce, CoGroup) inside the TaskManager
-(including sorting/hashing/caching), so this value should be as
-large as possible. If the cluster is exclusively running Flink,
-the total amount of available memory per machine minus some memory for the
-operating system (maybe 1-2 GB) is a good value.
-On YARN setups, this value is automatically configured to the size of
-the TaskManager's YARN container, minus a certain tolerance value.
-
-- `taskmanager.numberOfTaskSlots`: The number of parallel operator or
-user function instances that a single TaskManager can run (DEFAULT: 1).
-If this value is larger than 1, a single TaskManager takes multiple instances of
-a function or operator. That way, the TaskManager can utilize multiple CPU cores,
-but at the same time, the available memory is divided between the different
-operator or function instances.
-This value is typically proportional to the number of physical CPU cores that
-the TaskManager's machine has (e.g., equal to the number of cores, or half the
-number of cores). [More about task slots](config.html#configuring-taskmanager-processing-slots).
-
-- `parallelism.default`: The default parallelism to use for programs that have
-no parallelism specified. (DEFAULT: 1). For setups that have no concurrent jobs
-running, setting this value to NumTaskManagers * NumSlotsPerTaskManager will
-cause the system to use all available execution resources for the program's
-execution. **Note**: The default parallelism can be overwriten for an entire
-job by calling `setParallelism(int parallelism)` on the `ExecutionEnvironment`
-or by passing `-p <parallelism>` to the Flink Command-line frontend. It can be
-overwritten for single transformations by calling `setParallelism(int
-parallelism)` on an operator. See the [programming
-guide]({{site.baseurl}}/apis/programming_guide.html#parallel-execution) for more information about the
-parallelism.
-
-- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop File System's (HDFS)
-configuration **directory** (OPTIONAL VALUE).
-Specifying this value allows programs to reference HDFS files using short URIs
-(`hdfs:///path/to/files`, without including the address and port of the NameNode
-in the file URI). Without this option, HDFS files can be accessed, but require
-fully qualified URIs like `hdfs://address:port/path/to/files`.
-This option also causes file writers to pick up the HDFS's default values for block sizes
-and replication factors. Flink will look for the "core-site.xml" and
-"hdfs-site.xml" files in teh specified directory.
+- `jobmanager.heap.mb`: JVM heap size (in megabytes) for the JobManager. You may have to increase the heap size for the JobManager if you are running very large applications (with many operators), or if you are keeping a long history of them.
+
+- `taskmanager.heap.mb`: JVM heap size (in megabytes) for the TaskManagers, which are the parallel workers of the system. In contrast to Hadoop, Flink runs operators (e.g., join, aggregate) and user-defined functions (e.g., Map, Reduce, CoGroup) inside the TaskManager (including sorting/hashing/caching), so this value should be as large as possible. If the cluster is exclusively running Flink, the total amount of available memory per machine minus some memory for the operating system (maybe 1-2 GB) is a good value. On YARN setups, this value is automatically configured to the size of the TaskManager's YARN container, minus a certain tolerance value.
+
+- `taskmanager.numberOfTaskSlots`: The number of parallel operator or user function instances that a single TaskManager can run (DEFAULT: 1). If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager's machine has (e.g., equal to the number of cores, or half the number of cores). [More about task slots](config.html#configuring-taskmanager-processing-slots).
+
+- `parallelism.default`: The default parallelism to use for programs that have no parallelism specified (DEFAULT: 1). For setups that have no concurrent jobs running, setting this value to NumTaskManagers * NumSlotsPerTaskManager will cause the system to use all available execution resources for the program's execution. **Note**: The default parallelism can be overwritten for an entire job by calling `setParallelism(int parallelism)` on the `ExecutionEnvironment` or by passing `-p <parallelism>` to the Flink command-line frontend. It can be overwritten for single transformations by calling `setParallelism(int parallelism)`
+on an operator. See the [programming guide]({{site.baseurl}}/apis/programming_guide.html#parallel-execution) for more information about parallelism.
 
+- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop File System's (HDFS) configuration **directory** (OPTIONAL VALUE). Specifying this value allows programs to reference HDFS files using short URIs (`hdfs:///path/to/files`, without including the address and port of the NameNode in the file URI). Without this option, HDFS files can be accessed, but require fully qualified URIs like `hdfs://address:port/path/to/files`. This option also causes file writers to pick up the HDFS's default values for block sizes and replication factors. Flink will look for the "core-site.xml" and "hdfs-site.xml" files in the specified directory.
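+
+As a rough illustration of the common options above, a `conf/flink-conf.yaml` for a small cluster might look like the following sketch. The host name, memory sizes, slot count, and Hadoop path are placeholder assumptions, not recommendations.
+
+```yaml
+# Hypothetical values for illustration only; adjust to your machines.
+jobmanager.rpc.address: master-host    # assumed host name
+jobmanager.rpc.port: 6123              # default
+jobmanager.heap.mb: 1024
+taskmanager.heap.mb: 4096
+taskmanager.numberOfTaskSlots: 4       # e.g. one slot per CPU core
+parallelism.default: 8                 # e.g. 2 TaskManagers * 4 slots
+fs.hdfs.hadoopconf: /etc/hadoop/conf   # assumed path to the Hadoop config directory
+```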
 
 ## Advanced Options
 
 ### Managed Memory
 
-By default, Flink allocates a fraction of 0.7 of the total memory configured via
-`taskmanager.heap.mb` for its managed memory. Managed memory helps Flink to run
-the operators efficiently. It prevents OutOfMemoryExceptions because Flink knows
-how much memory it can use to execute operations. If Flink runs out of managed
-memory, it utilizes disk space. Using managed memory, some operations can be
-performed directly on the raw data without having to deserialize the data to
-convert it into Java objects. All in all, managed memory improves the robustness
-and speed of the system.
-
-The default fraction for managed memory can be adjusted using the
-`taskmanager.memory.fraction` parameter. An absolute value may be set using
-`taskmanager.memory.size` (overrides the fraction parameter). If desired, the
-managed memory may be allocated outside the JVM heap. This may improve
-performance in setups with large memory sizes.
-
-- `taskmanager.memory.size`: The amount of memory (in megabytes) that the task
-manager reserves on the JVM's heap space for sorting, hash tables, and caching
-of intermediate results. If unspecified (-1), the memory manager will take a fixed
-ratio of the heap memory available to the JVM, as specified by
-`taskmanager.memory.fraction`. (DEFAULT: -1)
-
-- `taskmanager.memory.fraction`: The relative amount of memory that the task
-manager reserves for sorting, hash tables, and caching of intermediate results.
-For example, a value of 0.8 means that TaskManagers reserve 80% of the
-JVM's heap space for internal data buffers, leaving 20% of the JVM's heap space
-free for objects created by user-defined functions. (DEFAULT: 0.7)
-This parameter is only evaluated, if `taskmanager.memory.size` is not set.
-
-- `taskmanager.memory.off-heap`: If set to `true`, the task manager allocates
-memory which is used for sorting, hash tables, and caching of intermediate
-results outside of the JVM heap. For setups with larger quantities of memory,
-this can improve the efficiency of the operations performed on the memory
-(DEFAULT: false).
-
-- `taskmanager.memory.segment-size`: The size of memory buffers used by the
-memory manager and the network stack in bytes (DEFAULT: 32768 (= 32 KiBytes)).
-
-- `taskmanager.memory.preallocate`: Can be either of `true` or `false`. Specifies whether task
-managers should allocate all managed memory when starting up. (DEFAULT: false)
+By default, Flink allocates a fraction of 0.7 of the total memory configured via `taskmanager.heap.mb` for its managed memory. Managed memory helps Flink to run the operators efficiently. It prevents OutOfMemoryExceptions because Flink knows how much memory it can use to execute operations. If Flink runs out of managed memory, it utilizes disk space. Using managed memory, some operations can be performed directly on the raw data without having to deserialize the data to convert it into Java objects. All in all, managed memory improves the robustness and speed of the system.
 
+The default fraction for managed memory can be adjusted using the `taskmanager.memory.fraction` parameter. An absolute value may be set using `taskmanager.memory.size` (overrides the fraction parameter). If desired, the managed memory may be allocated outside the JVM heap. This may improve performance in setups with large memory sizes.
 
-### Kerberos
+- `taskmanager.memory.size`: The amount of memory (in megabytes) that the task manager reserves on the JVM's heap space for sorting, hash tables, and caching of intermediate results. If unspecified (-1), the memory manager will take a fixed ratio of the heap memory available to the JVM, as specified by `taskmanager.memory.fraction`. (DEFAULT: -1)
 
-Flink supports Kerberos authentication of Hadoop services such as HDFS, YARN,
-or HBase.
+- `taskmanager.memory.fraction`: The relative amount of memory that the task manager reserves for sorting, hash tables, and caching of intermediate results. For example, a value of 0.8 means that TaskManagers reserve 80% of the JVM's heap space for internal data buffers, leaving 20% of the JVM's heap space free for objects created by user-defined functions. (DEFAULT: 0.7) This parameter is only evaluated if `taskmanager.memory.size` is not set.
 
-While Hadoop uses Kerberos tickets to authenticate users with services
-initially, the authentication process continues differently afterwards. Instead
-of saving the ticket to authenticate on a later access, Hadoop creates its own
-security tockens (DelegationToken) that it passes around. These are
-authenticated to Kerberos periodically but are independent of the token renewal
-time. The tokens have a maximum life span identical to the Kerberos ticket maximum life
-span.
+- `taskmanager.memory.off-heap`: If set to `true`, the task manager allocates memory which is used for sorting, hash tables, and caching of intermediate results outside of the JVM heap. For setups with larger quantities of memory, this can improve the efficiency of the operations performed on the memory (DEFAULT: false).
 
-Please make sure to set the maximum ticket life span high long running
-jobs. The renewal time of the ticket, on the other hand, is not important
-because Hadoop abstracts this away using its own security tocken renewal
-system. Hadoop makes sure that tickets are renewed in time and you can be sure
-to be authenticated until the end of the ticket life time.
+- `taskmanager.memory.segment-size`: The size of memory buffers used by the memory manager and the network stack in bytes (DEFAULT: 32768 (= 32 KiBytes)).
 
-If you are on YARN, then it is sufficient to authenticate the client with
-Kerberos. On a Flink standalone cluster you need to ensure that, initially, all
-nodes are authenticated with Kerberos using the `kinit` tool.
+- `taskmanager.memory.preallocate`: Can be either of `true` or `false`. Specifies whether task managers should allocate all managed memory when starting up. (DEFAULT: false)
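+
+The managed memory options above could be combined as in the following sketch. The values shown are the defaults stated in this section, except for the heap size, which is an assumed example; with these settings roughly 0.7 * 4096 MB (about 2867 MB) would be reserved as managed memory.
+
+```yaml
+# Sketch only: heap size is an assumed example, the rest are the documented defaults.
+taskmanager.heap.mb: 4096
+taskmanager.memory.fraction: 0.7        # used because taskmanager.memory.size is not set
+taskmanager.memory.off-heap: false
+taskmanager.memory.segment-size: 32768  # bytes (32 KiBytes)
+taskmanager.memory.preallocate: false
+```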
 
+### Kerberos
 
-### Other
+Flink supports Kerberos authentication of Hadoop services such as HDFS, YARN, or HBase.
 
-- `taskmanager.tmp.dirs`: The directory for temporary files, or a list of
-directories separated by the systems directory delimiter (for example ':'
-(colon) on Linux/Unix). If multiple directories are specified, then the temporary
-files will be distributed across the directories in a round-robin fashion. The
-I/O manager component will spawn one reading and one writing thread per
-directory. A directory may be listed multiple times to have the I/O manager use
-multiple threads for it (for example if it is physically stored on a very fast
-disc or RAID) (DEFAULT: The system's tmp dir).
+While Hadoop uses Kerberos tickets to authenticate users with services initially, the authentication process continues differently afterwards. Instead of saving the ticket to authenticate on a later access, Hadoop creates its own security tokens (DelegationToken) that it passes around. These are authenticated to Kerberos periodically but are independent of the token renewal time. The tokens have a maximum life span identical to the Kerberos ticket maximum life span.
 
-- `jobmanager.web.port`: Port of the JobManager's web interface (DEFAULT: 8081).
+Please make sure to set the maximum ticket life span high for long-running jobs. The renewal time of the ticket, on the other hand, is not important because Hadoop abstracts this away using its own security token renewal system. Hadoop makes sure that tickets are renewed in time and you can be sure to be authenticated until the end of the ticket life time.
+
+If you are on YARN, then it is sufficient to authenticate the client with Kerberos. On a Flink standalone cluster you need to ensure that, initially, all nodes are authenticated with Kerberos using the `kinit` tool.
+
+### Other
 
-- `fs.overwrite-files`: Specifies whether file output writers should overwrite
-existing files by default. Set to *true* to overwrite by default, *false* otherwise.
-(DEFAULT: false)
+- `taskmanager.tmp.dirs`: The directory for temporary files, or a list of directories separated by the system's directory delimiter (for example ':' (colon) on Linux/Unix). If multiple directories are specified, then the temporary files will be distributed across the directories in a round-robin fashion. The I/O manager component will spawn one reading and one writing thread per directory. A directory may be listed multiple times to have the I/O manager use multiple threads for it (for example if it is physically stored on a very fast disk or RAID) (DEFAULT: The system's tmp dir).
 
-- `fs.output.always-create-directory`: File writers running with a parallelism
-larger than one create a directory for the output file path and put the different
-result files (one per parallel writer task) into that directory. If this option
-is set to *true*, writers with a parallelism of 1 will also create a directory
-and place a single result file into it. If the option is set to *false*, the
-writer will directly create the file directly at the output path, without
-creating a containing directory. (DEFAULT: false)
+- `jobmanager.web.port`: Port of the JobManager's web interface (DEFAULT: 8081).
 
-- `taskmanager.network.numberOfBuffers`: The number of buffers available to the
-network stack. This number determines how many streaming data exchange channels
-a TaskManager can have at the same time and how well buffered the channels are.
-If a job is rejected or you get a warning that the system has not enough buffers
-available, increase this value (DEFAULT: 2048).
+- `fs.overwrite-files`: Specifies whether file output writers should overwrite existing files by default. Set to *true* to overwrite by default, *false* otherwise. (DEFAULT: false)
 
-- `env.java.opts`: Set custom JVM options. This value is respected by Flink's start scripts
-and Flink's YARN client.
-This can be used to set different garbage collectors or to include remote debuggers into
-the JVMs running Flink's services.
+- `fs.output.always-create-directory`: File writers running with a parallelism larger than one create a directory for the output file path and put the different result files (one per parallel writer task) into that directory. If this option is set to *true*, writers with a parallelism of 1 will also create a directory and place a single result file into it. If the option is set to *false*, the writer will create the file directly at the output path, without creating a containing directory. (DEFAULT: false)
 
-- `state.backend`: The backend that will be used to store operator state checkpoints if checkpointing is enabled.
+- `taskmanager.network.numberOfBuffers`: The number of buffers available to the network stack. This number determines how many streaming data exchange channels a TaskManager can have at the same time and how well buffered the channels are. If a job is rejected or you get a warning that the system does not have enough buffers available, increase this value (DEFAULT: 2048).
 
-  Supported backends:
+- `env.java.opts`: Set custom JVM options. This value is respected by Flink's start scripts and Flink's YARN client. This can be used to set different garbage collectors or to include remote debuggers into the JVMs running Flink's services.
 
+- `state.backend`: The backend that will be used to store operator state checkpoints if checkpointing is enabled. Supported backends:
    -  `jobmanager`: In-memory state, backup to JobManager's/ZooKeeper's memory. Should be used only for minimal state (Kafka offsets) or testing and local debugging.
    -  `filesystem`: State is in-memory on the TaskManagers, and state snapshots are stored in a file system. Supported are all filesystems supported by Flink, for example HDFS, S3, ...
 
-- `state.backend.fs.checkpointdir`: Directory for storing checkpoints in a flink supported filesystem
-Note: State backend must be accessible from the JobManager, use file:// only for local setups.
+- `state.backend.fs.checkpointdir`: Directory for storing checkpoints in a Flink-supported filesystem. Note: The state backend must be accessible from the JobManager; use file:// only for local setups.
 
 - `blob.storage.directory`: Directory for storing blobs (such as user jar's) on the TaskManagers.
 
-- `blob.server.port`: Port definition for the blob server (serving user jar's) on the Taskmanagers.
-By default the port is set to 0, which means that the operating system is picking an ephemeral port.
-Flink also accepts a list of ports ("50100,50101"), ranges ("50100-50200") or a combination of both.
-It is recommended to set a range of ports to avoid collisions when multiple JobManagers are running
-on the same machine.
+- `blob.server.port`: Port definition for the blob server (serving user jars) on the TaskManagers. By default the port is set to 0, which means that the operating system picks an ephemeral port. Flink also accepts a list of ports ("50100,50101"), ranges ("50100-50200") or a combination of both. It is recommended to set a range of ports to avoid collisions when multiple JobManagers are running on the same machine.
 
-- `execution-retries.delay`: Delay between execution retries. Default value "5 s". Note that values
-have to be specified as strings with a unit.
+- `execution-retries.delay`: Delay between execution retries. Default value "5 s". Note that values have to be specified as strings with a unit.
 
-- `execution-retries.default`: Default number of execution retries, used by jobs that do not explicitly
-specify that value on the execution environment. Default value is zero.
+- `execution-retries.default`: Default number of execution retries, used by jobs that do not explicitly specify that value on the execution environment. Default value is zero.
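+
+A sketch of how some of the options in this subsection might be set together is shown below. The directories, NameNode address, and retry values are illustrative assumptions; only the key names and value formats come from the descriptions above.
+
+```yaml
+# Hypothetical values for illustration only.
+taskmanager.tmp.dirs: /disk1/tmp:/disk2/tmp   # assumed paths, ':'-separated on Linux/Unix
+state.backend: filesystem
+state.backend.fs.checkpointdir: hdfs://address:port/flink/checkpoints   # replace address:port with your NameNode
+blob.server.port: 50100-50200                 # a port range, as recommended above
+execution-retries.default: 3                  # assumed value; the default is zero
+execution-retries.delay: 10 s                 # a string with a unit
+```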
 
 ## Full Reference
 
 ### HDFS
 
-These parameters configure the default HDFS used by Flink. Setups that do not
-specify a HDFS configuration have to specify the full path to
-HDFS files (`hdfs://address:port/path/to/files`) Files will also be written
-with default HDFS parameters (block size, replication factor).
+These parameters configure the default HDFS used by Flink. Setups that do not specify an HDFS configuration have to specify the full path to HDFS files (`hdfs://address:port/path/to/files`). Files will also be written with default HDFS parameters (block size, replication factor).
 
-- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop configuration directory.
-The system will look for the "core-site.xml" and "hdfs-site.xml" files in that
-directory (DEFAULT: null).
-- `fs.hdfs.hdfsdefault`: The absolute path of Hadoop's own configuration file
-"hdfs-default.xml" (DEFAULT: null).
-- `fs.hdfs.hdfssite`: The absolute path of Hadoop's own configuration file
-"hdfs-site.xml" (DEFAULT: null).
+- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop configuration directory. The system will look for the "core-site.xml" and "hdfs-site.xml" files in that directory (DEFAULT: null).
+- `fs.hdfs.hdfsdefault`: The absolute path of Hadoop's own configuration file "hdfs-default.xml" (DEFAULT: null).
+- `fs.hdfs.hdfssite`: The absolute path of Hadoop's own configuration file "hdfs-site.xml" (DEFAULT: null).
 
 ### JobManager &amp; TaskManager
 
 The following parameters configure Flink's JobManager and TaskManagers.
 
-- `jobmanager.rpc.address`: The IP address of the JobManager, which is the
-master/coordinator of the distributed system (DEFAULT: localhost).
+- `jobmanager.rpc.address`: The IP address of the JobManager, which is the master/coordinator of the distributed system (DEFAULT: localhost).
 - `jobmanager.rpc.port`: The port number of the JobManager (DEFAULT: 6123).
 - `taskmanager.rpc.port`: The task manager's IPC port (DEFAULT: 6122).
-- `taskmanager.data.port`: The task manager's port used for data exchange
-operations (DEFAULT: 6121).
-- `jobmanager.heap.mb`: JVM heap size (in megabytes) for the JobManager
-(DEFAULT: 256).
-- `taskmanager.heap.mb`: JVM heap size (in megabytes) for the TaskManagers,
-which are the parallel workers of the system. In
-contrast to Hadoop, Flink runs operators (e.g., join, aggregate) and
-user-defined functions (e.g., Map, Reduce, CoGroup) inside the TaskManager
-(including sorting/hashing/caching), so this value should be as
-large as possible (DEFAULT: 512). On YARN setups, this value is automatically
-configured to the size of the TaskManager's YARN container, minus a
-certain tolerance value.
-- `taskmanager.numberOfTaskSlots`: The number of parallel operator or
-user function instances that a single TaskManager can run (DEFAULT: 1).
-If this value is larger than 1, a single TaskManager takes multiple instances of
-a function or operator. That way, the TaskManager can utilize multiple CPU cores,
-but at the same time, the available memory is divided between the different
-operator or function instances.
-This value is typically proportional to the number of physical CPU cores that
-the TaskManager's machine has (e.g., equal to the number of cores, or half the
-number of cores).
-- `taskmanager.tmp.dirs`: The directory for temporary files, or a list of
-directories separated by the systems directory delimiter (for example ':'
-(colon) on Linux/Unix). If multiple directories are specified, then the temporary
-files will be distributed across the directories in a round robin fashion. The
-I/O manager component will spawn one reading and one writing thread per
-directory. A directory may be listed multiple times to have the I/O manager use
-multiple threads for it (for example if it is physically stored on a very fast
-disc or RAID) (DEFAULT: The system's tmp dir).
-- `taskmanager.network.numberOfBuffers`: The number of buffers available to the
-network stack. This number determines how many streaming data exchange channels
-a TaskManager can have at the same time and how well buffered the channels are.
-If a job is rejected or you get a warning that the system has not enough buffers
-available, increase this value (DEFAULT: 2048).
-- `taskmanager.memory.size`: The amount of memory (in megabytes) that the task
-manager reserves on the JVM's heap space for sorting, hash tables, and caching
-of intermediate results. If unspecified (-1), the memory manager will take a fixed
-ratio of the heap memory available to the JVM, as specified by
-`taskmanager.memory.fraction`. (DEFAULT: -1)
-- `taskmanager.memory.fraction`: The relative amount of memory that the task
-manager reserves for sorting, hash tables, and caching of intermediate results.
-For example, a value of 0.8 means that TaskManagers reserve 80% of the
-JVM's heap space for internal data buffers, leaving 20% of the JVM's heap space
-free for objects created by user-defined functions. (DEFAULT: 0.7)
-This parameter is only evaluated, if `taskmanager.memory.size` is not set.
-- `jobclient.polling.interval`: The interval (in seconds) in which the client
-polls the JobManager for the status of its job (DEFAULT: 2).
-- `taskmanager.heartbeat-interval`: The interval in which the TaskManager sends
-heartbeats to the JobManager.
-- `jobmanager.max-heartbeat-delay-before-failure.msecs`: The maximum time that a
-TaskManager hearbeat may be missing before the TaskManager is considered failed.
+- `taskmanager.data.port`: The task manager's port used for data exchange operations (DEFAULT: 6121).
+- `jobmanager.heap.mb`: JVM heap size (in megabytes) for the JobManager (DEFAULT: 256).
+- `taskmanager.heap.mb`: JVM heap size (in megabytes) for the TaskManagers, which are the parallel workers of the system. In contrast to Hadoop, Flink runs operators (e.g., join, aggregate) and user-defined functions (e.g., Map, Reduce, CoGroup) inside the TaskManager (including sorting/hashing/caching), so this value should be as large as possible (DEFAULT: 512). On YARN setups, this value is automatically configured to the size of the TaskManager's YARN container, minus a certain tolerance value.
+- `taskmanager.numberOfTaskSlots`: The number of parallel operator or user function instances that a single TaskManager can run (DEFAULT: 1). If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager's machine has (e.g., equal to the number of cores, or half the number of cores).
+- `taskmanager.tmp.dirs`: The directory for temporary files, or a list of directories separated by the system's directory delimiter (for example ':' (colon) on Linux/Unix). If multiple directories are specified, then the temporary files will be distributed across the directories in a round-robin fashion. The I/O manager component will spawn one reading and one writing thread per directory. A directory may be listed multiple times to have the I/O manager use multiple threads for it (for example if it is physically stored on a very fast disk or RAID) (DEFAULT: The system's tmp dir).
+- `taskmanager.network.numberOfBuffers`: The number of buffers available to the network stack. This number determines how many streaming data exchange channels a TaskManager can have at the same time and how well buffered the channels are. If a job is rejected or you get a warning that the system does not have enough buffers available, increase this value (DEFAULT: 2048).
+- `taskmanager.memory.size`: The amount of memory (in megabytes) that the task manager reserves on the JVM's heap space for sorting, hash tables, and caching of intermediate results. If unspecified (-1), the memory manager will take a fixed ratio of the heap memory available to the JVM, as specified by `taskmanager.memory.fraction`. (DEFAULT: -1)
+- `taskmanager.memory.fraction`: The relative amount of memory that the task manager reserves for sorting, hash tables, and caching of intermediate results. For example, a value of 0.8 means that TaskManagers reserve 80% of the JVM's heap space for internal data buffers, leaving 20% of the JVM's heap space free for objects created by user-defined functions. (DEFAULT: 0.7) This parameter is only evaluated if `taskmanager.memory.size` is not set.
+- `jobclient.polling.interval`: The interval (in seconds) in which the client polls the JobManager for the status of its job (DEFAULT: 2).
+- `taskmanager.heartbeat-interval`: The interval in which the TaskManager sends heartbeats to the JobManager.
+- `jobmanager.max-heartbeat-delay-before-failure.msecs`: The maximum time that a TaskManager heartbeat may be missing before the TaskManager is considered failed.
 
 ### Distributed Coordination (via Akka)
 
@@ -328,125 +157,66 @@ TaskManager hearbeat may be missing before the TaskManager is considered failed.
 
 ### JobManager Web Frontend
 
-- `jobmanager.web.port`: Port of the JobManager's web interface that displays
-status of running jobs and execution time breakdowns of finished jobs
-(DEFAULT: 8081). Setting this value to `-1` disables the web frontend.
-- `jobmanager.web.history`: The number of latest jobs that the JobManager's web
-front-end in its history (DEFAULT: 5).
+- `jobmanager.web.port`: Port of the JobManager's web interface that displays status of running jobs and execution time breakdowns of finished jobs (DEFAULT: 8081). Setting this value to `-1` disables the web frontend.
+- `jobmanager.web.history`: The number of latest jobs that the JobManager's web front-end keeps in its history (DEFAULT: 5).
 
 ### Webclient
 
-These parameters configure the web interface that can be used to submit jobs and
-review the compiler's execution plans.
+These parameters configure the web interface that can be used to submit jobs and review the compiler's execution plans.
 
 - `webclient.port`: The port of the webclient server (DEFAULT: 8080).
-- `webclient.tempdir`: The temp directory for the web server. Used for example
-for caching file fragments during file-uploads (DEFAULT: The system's temp
-directory).
-- `webclient.uploaddir`: The directory into which the web server will store
-uploaded programs (DEFAULT: ${webclient.tempdir}/webclient-jobs/).
-- `webclient.plandump`: The directory into which the web server will dump
-temporary JSON files describing the execution plans
-(DEFAULT: ${webclient.tempdir}/webclient-plans/).
+- `webclient.tempdir`: The temp directory for the web server. Used for example for caching file fragments during file-uploads (DEFAULT: The system's temp directory).
+- `webclient.uploaddir`: The directory into which the web server will store uploaded programs (DEFAULT: ${webclient.tempdir}/webclient-jobs/).
+- `webclient.plandump`: The directory into which the web server will dump temporary JSON files describing the execution plans (DEFAULT: ${webclient.tempdir}/webclient-plans/).
 
 ### File Systems
 
 The parameters define the behavior of tasks that create result files.
 
-- `fs.overwrite-files`: Specifies whether file output writers should overwrite
-existing files by default. Set to *true* to overwrite by default, *false* otherwise.
-(DEFAULT: false)
-- `fs.output.always-create-directory`: File writers running with a parallelism
-larger than one create a directory for the output file path and put the different
-result files (one per parallel writer task) into that directory. If this option
-is set to *true*, writers with a parallelism of 1 will also create a directory
-and place a single result file into it. If the option is set to *false*, the
-writer will directly create the file directly at the output path, without
-creating a containing directory. (DEFAULT: false)
+- `fs.overwrite-files`: Specifies whether file output writers should overwrite existing files by default. Set to *true* to overwrite by default, *false* otherwise. (DEFAULT: false)
+- `fs.output.always-create-directory`: File writers running with a parallelism larger than one create a directory for the output file path and put the different result files (one per parallel writer task) into that directory. If this option is set to *true*, writers with a parallelism of 1 will also create a directory and place a single result file into it. If the option is set to *false*, the writer will create the file directly at the output path, without creating a containing directory. (DEFAULT: false)
 
 ### Compiler/Optimizer
 
-- `compiler.delimited-informat.max-line-samples`: The maximum number of line
-samples taken by the compiler for delimited inputs. The samples are used to
-estimate the number of records. This value can be overridden for a specific
-input with the input format's parameters (DEFAULT: 10).
-- `compiler.delimited-informat.min-line-samples`: The minimum number of line
-samples taken by the compiler for delimited inputs. The samples are used to
-estimate the number of records. This value can be overridden for a specific
-input with the input format's parameters (DEFAULT: 2).
-- `compiler.delimited-informat.max-sample-len`: The maximal length of a line
-sample that the compiler takes for delimited inputs. If the length of a single
-sample exceeds this value (possible because of misconfiguration of the parser),
-the sampling aborts. This value can be overridden for a specific input with the
-input format's parameters (DEFAULT: 2097152 (= 2 MiBytes)).
+- `compiler.delimited-informat.max-line-samples`: The maximum number of line samples taken by the compiler for delimited inputs. The samples are used to estimate the number of records. This value can be overridden for a specific input with the input format's parameters (DEFAULT: 10).
+- `compiler.delimited-informat.min-line-samples`: The minimum number of line samples taken by the compiler for delimited inputs. The samples are used to estimate the number of records. This value can be overridden for a specific input with the input format's parameters (DEFAULT: 2).
+- `compiler.delimited-informat.max-sample-len`: The maximal length of a line sample that the compiler takes for delimited inputs. If the length of a single sample exceeds this value (possible because of misconfiguration of the parser), the sampling aborts. This value can be overridden for a specific input with the input format's parameters (DEFAULT: 2097152 (= 2 MiBytes)).
 
 ### Runtime Algorithms
 
-- `taskmanager.runtime.max-fan`: The maximal fan-in for external merge joins and
-fan-out for spilling hash tables. Limits the number of file handles per operator,
-but may cause intermediate merging/partitioning, if set too small (DEFAULT: 128).
-- `taskmanager.runtime.sort-spilling-threshold`: A sort operation starts spilling
-when this fraction of its memory budget is full (DEFAULT: 0.8).
+- `taskmanager.runtime.max-fan`: The maximal fan-in for external merge joins and fan-out for spilling hash tables. Limits the number of file handles per operator, but may cause intermediate merging/partitioning, if set too small (DEFAULT: 128).
+- `taskmanager.runtime.sort-spilling-threshold`: A sort operation starts spilling when this fraction of its memory budget is full (DEFAULT: 0.8).
 - `taskmanager.runtime.hashjoin-bloom-filters`: If true, the hash join uses bloom filters to pre-filter records against spilled partitions. (DEFAULT: true)
 
-
 ## YARN
 
-
-- `yarn.heap-cutoff-ratio`: (Default 0.25) Percentage of heap space to remove from containers started by YARN.
-When a user requests a certain amount of memory for each TaskManager container (for example 4 GB),
-we can not pass this amount as the maximum heap space for the JVM (`-Xmx` argument) because the JVM
-is also allocating memory outside the heap. YARN is very strict with killing containers which are using
-more memory than requested.
-Therefore, we remove a 15% of the memory from the requested heap as a safety margin.
+- `yarn.heap-cutoff-ratio`: (Default 0.25) Fraction of heap space to remove from containers started by YARN. When a user requests a certain amount of memory for each TaskManager container (for example 4 GB), we cannot pass this amount as the maximum heap space for the JVM (`-Xmx` argument) because the JVM also allocates memory outside the heap. YARN is very strict about killing containers that use more memory than requested. Therefore, we remove a fraction of the requested memory (25% by default) from the heap as a safety margin.
 - `yarn.heap-cutoff-min`: (Default 384 MB) Minimum amount of memory to cut off the requested heap size.
 
 - `yarn.reallocate-failed` (Default 'true') Controls whether YARN should reallocate failed containers
 
-- `yarn.maximum-failed-containers` (Default: number of requested containers). Maximum number of containers the system
-is going to reallocate in case of a failure.
+- `yarn.maximum-failed-containers` (Default: number of requested containers). Maximum number of containers the system is going to reallocate in case of a failure.
 
-- `yarn.application-attempts` (Default: 1). Number of ApplicationMaster restarts. Note that that the entire Flink cluster
-will restart and the YARN Client will loose the connection. Also, the JobManager address will change and you'll need
-to set the JM host:port manually. It is recommended to leave this option at 1.
+- `yarn.application-attempts` (Default: 1). Number of ApplicationMaster restarts. Note that the entire Flink cluster will restart and the YARN Client will lose the connection. Also, the JobManager address will change and you'll need to set the JM host:port manually. It is recommended to leave this option at 1.
 
 - `yarn.heartbeat-delay` (Default: 5 seconds). Time between heartbeats with the ResourceManager.
 
-- `yarn.properties-file.location` (Default: temp directory). When a Flink job is submitted to YARN,
-the JobManager's host and the number of available processing slots is written into a properties file,
-so that the Flink client is able to pick those details up. This configuration parameter allows
-changing the default location of that file (for example for environments sharing a Flink
-installation between users)
-
-- `yarn.application-master.env.`*ENV_VAR1=value* Configuration values prefixed with `yarn.application-master.env.`
-will be passed as environment variables to the ApplicationMaster/JobManager process.
-For example for passing `LD_LIBRARY_PATH` as an env variable to the ApplicationMaster, set:
-
-      yarn.application-master.env.LD_LIBRARY_PATH: "/usr/lib/native"
+- `yarn.properties-file.location` (Default: temp directory). When a Flink job is submitted to YARN, the JobManager's host and the number of available processing slots are written into a properties file, so that the Flink client is able to pick those details up. This configuration parameter allows changing the default location of that file (for example for environments sharing a Flink installation between users).
 
+- `yarn.application-master.env.`*ENV_VAR1=value* Configuration values prefixed with `yarn.application-master.env.` will be passed as environment variables to the ApplicationMaster/JobManager process. For example for passing `LD_LIBRARY_PATH` as an env variable to the ApplicationMaster, set:
+	
+	yarn.application-master.env.LD_LIBRARY_PATH: "/usr/lib/native"
 
-- `yarn.taskmanager.env.` Similar to the configuration prefix about, this prefix allows setting custom
-environment variables for the TaskManager processes.
+- `yarn.taskmanager.env.` Similar to the configuration prefix above, this prefix allows setting custom environment variables for the TaskManager processes.
 
+- `yarn.application-master.port` (Default: 0, which lets the OS choose an ephemeral port) With this configuration option, users can specify a port, a range of ports or a list of ports for the Application Master (and JobManager) RPC port. We recommend keeping the default value (0) to let the operating system choose an appropriate port. In particular, when multiple AMs are running on the same physical host, fixed port assignments can prevent the AM from starting.
 
-- `yarn.application-master.port` (Default: 0, which lets the OS choose an ephemeral port)
-With this configuration option, users can specify a port, a range of ports or a list of ports for the 
-Application Master (and JobManager) RPC port. By default we recommend using the default value (0) to
-let the operating system choose an appropriate port. In particular when multiple AMs are running on the 
-same physical host, fixed port assignments prevent the AM from starting.
-
-For example when running Flink on YARN on an environment with a restrictive firewall, this
-option allows specifying a range of allowed ports.
+For example, when running Flink on YARN in an environment with a restrictive firewall, this option allows specifying a range of allowed ports.
 
 
 ## High Availability Mode
 
-- `recovery.mode`: (Default 'standalone') Defines the recovery mode used for the cluster execution. Currently,
-Flink supports the 'standalone' mode where only a single JobManager runs and no JobManager state is checkpointed.
-The high availability mode 'zookeeper' supports the execution of multiple JobManagers and JobManager state checkpointing.
-Among the group of JobManagers, ZooKeeper elects one of them as the leader which is responsible for the cluster execution.
-In case of a JobManager failure, a standby JobManager will be elected as the new leader and is given the last checkpointed JobManager state.
-In order to use the 'zookeeper' mode, it is mandatory to also define the `recovery.zookeeper.quorum` configuration value.
+- `recovery.mode`: (Default 'standalone') Defines the recovery mode used for the cluster execution. Currently, Flink supports the 'standalone' mode where only a single JobManager runs and no JobManager state is checkpointed. The high availability mode 'zookeeper' supports the execution of multiple JobManagers and JobManager state checkpointing. Among the group of JobManagers, ZooKeeper elects one of them as the leader which is responsible for the cluster execution. In case of a JobManager failure, a standby JobManager will be elected as the new leader and is given the last checkpointed JobManager state. In order to use the 'zookeeper' mode, it is mandatory to also define the `recovery.zookeeper.quorum` configuration value.
 
 - `recovery.zookeeper.quorum`: Defines the ZooKeeper quorum URL which is used to connet to the ZooKeeper cluster when the 'zookeeper' recovery mode is selected
 
@@ -468,70 +238,34 @@ In order to use the 'zookeeper' mode, it is mandatory to also define the `recove
 
 ### Configuring the Network Buffers
 
-Network buffers are a critical resource for the communication layers. They are
-used to buffer records before transmission over a network, and to buffer
-incoming data before dissecting it into records and handing them to the
-application. A sufficient number of network buffers is critical to achieve a
-good throughput.
-
-In general, configure the task manager to have enough buffers that each logical
-network connection on you expect to be open at the same time has a dedicated
-buffer. A logical network connection exists for each point-to-point exchange of
-data over the network, which typically happens at repartitioning- or
-broadcasting steps. In those, each parallel task inside the TaskManager has to
-be able to talk to all other parallel tasks. Hence, the required number of
-buffers on a task manager is *total-degree-of-parallelism* (number of targets)
-\* *intra-node-parallelism* (number of sources in one task manager) \* *n*.
-Here, *n* is a constant that defines how many repartitioning-/broadcasting steps
-you expect to be active at the same time.
-
-Since the *intra-node-parallelism* is typically the number of cores, and more
-than 4 repartitioning or broadcasting channels are rarely active in parallel, it
-frequently boils down to *\#cores\^2\^* \* *\#machines* \* 4. To support for
-example a cluster of 20 8-core machines, you should use roughly 5000 network
-buffers for optimal throughput.
-
-Each network buffer has by default a size of 32 KiBytes. In the above example, the
-system would allocate roughly 300 MiBytes for network buffers.
-
-The number and size of network buffers can be configured with the following
-parameters:
+Network buffers are a critical resource for the communication layers. They are used to buffer records before transmission over a network, and to buffer incoming data before dissecting it into records and handing them to the
+application. A sufficient number of network buffers is critical to achieve a good throughput.
+
+In general, configure the task manager to have enough buffers that each logical network connection that you expect to be open at the same time has a dedicated buffer. A logical network connection exists for each point-to-point exchange of data over the network, which typically happens at repartitioning or broadcasting steps. In those, each parallel task inside the TaskManager has to be able to talk to all other parallel tasks. Hence, the required number of buffers on a task manager is *total-degree-of-parallelism* (number of targets) \* *intra-node-parallelism* (number of sources in one task manager) \* *n*. Here, *n* is a constant that defines how many repartitioning/broadcasting steps you expect to be active at the same time.
+
+Since the *intra-node-parallelism* is typically the number of cores, and more than 4 repartitioning or broadcasting channels are rarely active in parallel, it frequently boils down to *\#cores<sup>2</sup>* \* *\#machines* \* 4. To support, for example, a cluster of 20 8-core machines (8<sup>2</sup> \* 20 \* 4 = 5120), you should use roughly 5000 network buffers for optimal throughput.
+
+Each network buffer has a default size of 32 KiBytes. In the above example, the system would allocate roughly 160 MiBytes for network buffers.
+
+The number and size of network buffers can be configured with the following parameters:
 
 - `taskmanager.network.numberOfBuffers`, and
 - `taskmanager.memory.segment-size`.
 
 ### Configuring Temporary I/O Directories
 
-Although Flink aims to process as much data in main memory as possible,
-it is not uncommon that more data needs to be processed than memory is
-available. Flink's runtime is designed to write temporary data to disk
-to handle these situations.
-
-The `taskmanager.tmp.dirs` parameter specifies a list of directories into which
-Flink writes temporary files. The paths of the directories need to be
-separated by ':' (colon character). Flink will concurrently write (or
-read) one temporary file to (from) each configured directory. This way,
-temporary I/O can be evenly distributed over multiple independent I/O devices
-such as hard disks to improve performance. To leverage fast I/O devices (e.g.,
-SSD, RAID, NAS), it is possible to specify a directory multiple times.
+Although Flink aims to process as much data in main memory as possible, it is not uncommon that more data needs to be processed than fits into the available memory. Flink's runtime is designed to write temporary data to disk to handle these situations.
 
-If the `taskmanager.tmp.dirs` parameter is not explicitly specified,
-Flink writes temporary data to the temporary directory of the operating
-system, such as */tmp* in Linux systems.
+The `taskmanager.tmp.dirs` parameter specifies a list of directories into which Flink writes temporary files. The paths of the directories need to be separated by ':' (colon character). Flink will concurrently write (or read) one temporary file to (from) each configured directory. This way, temporary I/O can be evenly distributed over multiple independent I/O devices such as hard disks to improve performance. To leverage fast I/O devices (e.g., SSD, RAID, NAS), it is possible to specify a directory multiple times.
 
+If the `taskmanager.tmp.dirs` parameter is not explicitly specified, Flink writes temporary data to the temporary directory of the operating system, such as */tmp* in Linux systems.
 
 ### Configuring TaskManager processing slots
 
 Flink executes a program in parallel by splitting it into subtasks and scheduling these subtasks to processing slots.
 
-Each Flink TaskManager provides processing slots in the cluster. The number of slots
-is typically proportional to the number of available CPU cores __of each__ TaskManager.
-As a general recommendation, the number of available CPU cores is a good default for
-`taskmanager.numberOfTaskSlots`.
+Each Flink TaskManager provides processing slots in the cluster. The number of slots is typically proportional to the number of available CPU cores __of each__ TaskManager. As a general recommendation, the number of available CPU cores is a good default for `taskmanager.numberOfTaskSlots`.
 
-When starting a Flink application, users can supply the default number of slots to use for that job.
-The command line value therefore is called `-p` (for parallelism). In addition, it is possible
-to [set the number of slots in the programming APIs]({{site.baseurl}}/apis/programming_guide.html#parallel-execution) for
-the whole application and individual operators.
+When starting a Flink application, users can supply the default number of slots to use for that job. The corresponding command-line option is called `-p` (for parallelism). In addition, it is possible to [set the number of slots in the programming APIs]({{site.baseurl}}/apis/programming_guide.html#parallel-execution) for the whole application and for individual operators.
 
 <img src="fig/slots_parallelism.svg" class="img-responsive" />
diff --git a/docs/setup/flink_on_tez.md b/docs/setup/flink_on_tez.md
deleted file mode 100644
index afbd147e353fc..0000000000000
--- a/docs/setup/flink_on_tez.md
+++ /dev/null
@@ -1,290 +0,0 @@
----
-title: "Running Flink on YARN leveraging Tez"
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-<a href="#top"></a>
-
-You can run Flink using Tez as an execution environment. Flink on Tez 
-is currently included in *flink-staging* in alpha. All classes are
-located in the *org.apache.flink.tez* package.
-
-* This will be replaced by the TOC
-{:toc}
-
-## Why Flink on Tez
-
-[Apache Tez](http://tez.apache.org) is a scalable data processing
-platform. Tez provides an API for specifying a directed acyclic
-graph (DAG), and functionality for placing the DAG vertices in YARN
-containers, as well as data shuffling.  In Flink's architecture,
-Tez is at about the same level as Flink's network stack. While Flink's
-network stack focuses heavily on low latency in order to support 
-pipelining, data streaming, and iterative algorithms, Tez
-focuses on scalability and elastic resource usage.
-
-Thus, by replacing Flink's network stack with Tez, users can get scalability
-and elastic resource usage in shared clusters while retaining Flink's 
-APIs, optimizer, and runtime algorithms (local sorts, hash tables, etc).
-
-Flink programs can run almost unmodified using Tez as an execution
-environment. Tez supports local execution (e.g., for debugging), and 
-remote execution on YARN.
-
-
-## Local execution
-
-The `LocalTezEnvironment` can be used run programs using the local
-mode provided by Tez. This example shows how WordCount can be run using the Tez local mode.
-It is identical to a normal Flink WordCount, except that the `LocalTezEnvironment` is used.
-To run in local Tez mode, you can simply run a Flink on Tez program
-from your IDE (e.g., right click and run).
-  
-{% highlight java %}
-public class WordCountExample {
-    public static void main(String[] args) throws Exception {
-        final LocalTezEnvironment env = LocalTezEnvironment.create();
-
-        DataSet<String> text = env.fromElements(
-            "Who's there?",
-            "I think I hear them. Stand, ho! Who's there?");
-
-        DataSet<Tuple2<String, Integer>> wordCounts = text
-            .flatMap(new LineSplitter())
-            .groupBy(0)
-            .sum(1);
-
-        wordCounts.print();
-
-        env.execute("Word Count Example");
-    }
-
-    public static class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
-        @Override
-        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
-            for (String word : line.split(" ")) {
-                out.collect(new Tuple2<String, Integer>(word, 1));
-            }
-        }
-    }
-}
-{% endhighlight %}
-
-## YARN execution
-
-### Setup
-
-- Install Tez on your Hadoop 2 cluster following the instructions from the
-  [Apache Tez website](http://tez.apache.org/install.html). If you are able to run 
-  the examples that ship with Tez, then Tez has been successfully installed.
-  
-- Currently, you need to build Flink yourself to obtain Flink on Tez
-  (the reason is a Hadoop version compatibility: Tez releases artifacts
-  on Maven central with a Hadoop 2.6.0 dependency). Build Flink
-  using `mvn -DskipTests clean package -Pinclude-tez -Dhadoop.version=X.X.X -Dtez.version=X.X.X`.
-  Make sure that the Hadoop version matches the version that Tez uses.
-  Obtain the jar file contained in the Flink distribution under
-  `flink-staging/flink-tez/target/flink-tez-x.y.z-flink-fat-jar.jar` 
-  and upload it to some directory in HDFS. E.g., to upload the file
-  to the directory `/apps`, execute
-  {% highlight bash %}
-  $ hadoop fs -put /path/to/flink-tez-x.y.z-flink-fat-jar.jar /apps
-  {% endhighlight %}  
- 
-- Edit the tez-site.xml configuration file, adding an entry that points to the
-  location of the file. E.g., assuming that the file is in the directory `/apps/`, 
-  add the following entry to tez-site.xml:
-    {% highlight xml %}
-<property>
-  <name>tez.aux.uris</name>
-  <value>${fs.default.name}/apps/flink-tez-x.y.z-flink-fat-jar.jar</value>
-</property>
-    {% endhighlight %}  
-    
-- At this point, you should be able to run the pre-packaged examples, e.g., run WordCount:
-  {% highlight bash %}
-  $ hadoop jar /path/to/flink-tez-x.y.z-flink-fat-jar.jar wc hdfs:/path/to/text hdfs:/path/to/output
-  {% endhighlight %}  
-
-
-### Packaging your program
-
-Application packaging is currently a bit different than in Flink standalone mode.
-  Flink programs that run on Tez need to be packaged in a "fat jar"
-  file that contain the Flink client. This jar can then be executed via the `hadoop jar` command.
-  An easy way to do that is to use the provided `flink-tez-quickstart` maven archetype.
-  Create a new project as
-  
-  {% highlight bash %}
-  $ mvn archetype:generate                             \
-    -DarchetypeGroupId=org.apache.flink              \
-    -DarchetypeArtifactId=flink-tez-quickstart           \
-    -DarchetypeVersion={{site.version}}
-  {% endhighlight %}
-  
-  and specify the group id, artifact id, version, and package of your project. For example,
-  let us assume the following options: `org.myorganization`, `flink-on-tez`, `0.1`, and `org.myorganization`.
-  You should see the following output on your terminal:
-  
-  {% highlight bash %}
-  $ mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-tez-quickstart
-  [INFO] Scanning for projects...
-  [INFO]
-  [INFO] ------------------------------------------------------------------------
-  [INFO] Building Maven Stub Project (No POM) 1
-  [INFO] ------------------------------------------------------------------------
-  [INFO]
-  [INFO] >>> maven-archetype-plugin:2.2:generate (default-cli) > generate-sources @ standalone-pom >>>
-  [INFO]
-  [INFO] <<< maven-archetype-plugin:2.2:generate (default-cli) < generate-sources @ standalone-pom <<<
-  [INFO]
-  [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom ---
-  [INFO] Generating project in Interactive mode
-  [INFO] Archetype [org.apache.flink:flink-tez-quickstart:0.9-SNAPSHOT] found in catalog local
-  Define value for property 'groupId': : org.myorganization
-  Define value for property 'artifactId': : flink-on-tez
-  Define value for property 'version':  1.0-SNAPSHOT: : 0.1
-  Define value for property 'package':  org.myorganization: :
-  Confirm properties configuration:
-  groupId: org.myorganization
-  artifactId: flink-on-tez
-  version: 0.1
-  package: org.myorganization
-   Y: : Y
-  [INFO] ----------------------------------------------------------------------------
-  [INFO] Using following parameters for creating project from Archetype: flink-tez-quickstart:0.9-SNAPSHOT
-  [INFO] ----------------------------------------------------------------------------
-  [INFO] Parameter: groupId, Value: org.myorganization
-  [INFO] Parameter: artifactId, Value: flink-on-tez
-  [INFO] Parameter: version, Value: 0.1
-  [INFO] Parameter: package, Value: org.myorganization
-  [INFO] Parameter: packageInPathFormat, Value: org/myorganization
-  [INFO] Parameter: package, Value: org.myorganization
-  [INFO] Parameter: version, Value: 0.1
-  [INFO] Parameter: groupId, Value: org.myorganization
-  [INFO] Parameter: artifactId, Value: flink-on-tez
-  [INFO] project created from Archetype in dir: /Users/kostas/Dropbox/flink-tez-quickstart-test/flink-on-tez
-  [INFO] ------------------------------------------------------------------------
-  [INFO] BUILD SUCCESS
-  [INFO] ------------------------------------------------------------------------
-  [INFO] Total time: 44.130 s
-  [INFO] Finished at: 2015-02-26T17:59:45+01:00
-  [INFO] Final Memory: 15M/309M
-  [INFO] ------------------------------------------------------------------------
-  {% endhighlight %}
-  
-  The project contains an example called `YarnJob.java` that provides the skeleton 
-  for a Flink-on-Tez job. Program execution is currently done using Hadoop's `ProgramDriver`, 
-  see the `Driver.java` class for an example. Create the fat jar using 
-  `mvn -DskipTests clean package`. The resulting jar will be located in the `target/` directory. 
-  You can now execute a job as follows:
-  
-  {% highlight bash %}
-$ mvn -DskipTests clean package
-$ hadoop jar flink-on-tez/target/flink-on-tez-0.1-flink-fat-jar.jar yarnjob [command-line parameters]
-  {% endhighlight %}
-  
-  Flink programs that run on YARN using Tez as an execution engine need to use the `RemoteTezEnvironment` and 
-  register the class that contains the `main` method with that environment:
-  {% highlight java %}
-  public class WordCountExample {
-      public static void main(String[] args) throws Exception {
-          final RemoteTezEnvironment env = RemoteTezEnvironment.create();
-  
-          DataSet<String> text = env.fromElements(
-              "Who's there?",
-              "I think I hear them. Stand, ho! Who's there?");
-  
-          DataSet<Tuple2<String, Integer>> wordCounts = text
-              .flatMap(new LineSplitter())
-              .groupBy(0)
-              .sum(1);
-  
-          wordCounts.print();
-      
-          env.registerMainClass(WordCountExample.class);
-          env.execute("Word Count Example");
-      }
-  
-      public static class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
-          @Override
-          public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
-              for (String word : line.split(" ")) {
-                  out.collect(new Tuple2<String, Integer>(word, 1));
-              }
-          }
-      }
-  }
-  {% endhighlight %}
-
-
-## How it works
-
-Flink on Tez reuses the Flink APIs, the Flink optimizer,
-and the Flink local runtime, including Flink's hash table and sort implementations. Tez
-replaces Flink's network stack and control plan, and is responsible for scheduling and
-network shuffles.
-
-The figure below shows how a Flink program passes through the Flink stack and generates
-a Tez DAG (instead of a JobGraph that would be created using normal Flink execution).
-
-<div style="text-align: center;">
-<img src="fig/flink_on_tez_translation.png" alt="Translation of a Flink program to a Tez DAG." height="600px" vspace="20px" style="text-align: center;"/>
-</div>
-
-All local processing, including memory management, sorting, and hashing is performed by
-Flink as usual. Local processing is encapsulated in Tez vertices, as seen in the figure
-below. Tez vertices are connected by edges. Tez is currently based on a key-value data
-model. In the current implementation, the elements that are processed by Flink operators
-are wrapped inside Tez values, and the Tez key field is used to indicate the index of the target task
-that the elements are destined to.
-
-<div style="text-align: center;">
-<img src="fig/flink_tez_vertex.png" alt="Encapsulation of Flink runtime inside Tez vertices." height="200px" vspace="20px" style="text-align: center;"/>
-</div>
-
-## Limitations
-
-Currently, Flink on Tez does not support all features of the Flink API. We are working
-to enable all of the missing features listed below. In the meantime, if your project depends on these features, we suggest
-to use [Flink on YARN]({{site.baseurl}}/setup/yarn_setup.html) or [Flink standalone]({{site.baseurl}}/quickstart/setup_quickstart.html).
-
-The following features are currently missing.
-
-- Dedicated client: jobs need to be submitted via Hadoop's command-line client
-
-- Self-joins: currently binary operators that receive the same input are not supported due to 
-  [TEZ-1190](https://issues.apache.org/jira/browse/TEZ-1190).
-
-- Iterative programs are currently not supported.
-
-- Broadcast variables are currently not supported.
-
-- Accummulators and counters are currently not supported.
-
-- Performance: The current implementation has not been heavily tested for performance, and misses several optimizations,
-  including task chaining.
-
-- Streaming API: Streaming programs will not currently compile to Tez DAGs.
-
-- Scala API: The current implementation has only been tested with the Java API.
-
-
-
diff --git a/docs/setup/gce_setup.md b/docs/setup/gce_setup.md
index f6499dc7cd193..32e22d2fdfa48 100644
--- a/docs/setup/gce_setup.md
+++ b/docs/setup/gce_setup.md
@@ -1,5 +1,8 @@
 ---
 title:  "Google Compute Engine Setup"
+top-nav-group: deployment
+top-nav-title: Google Compute Engine
+top-nav-pos: 4
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -21,12 +24,7 @@ under the License.
 -->
 
 
-This documentation provides instructions on how to setup Flink fully
-automatically with Hadoop 1 or Hadoop 2 on top of a
-[Google Compute Engine](https://cloud.google.com/compute/) cluster. This is made
-possible by Google's [bdutil](https://cloud.google.com/hadoop/bdutil) which
-starts a cluster and deploys Flink with Hadoop. To get started, just follow the
-steps below.
+This documentation provides instructions on how to set up Flink fully automatically with Hadoop 1 or Hadoop 2 on top of a [Google Compute Engine](https://cloud.google.com/compute/) cluster. This is made possible by Google's [bdutil](https://cloud.google.com/hadoop/bdutil), which starts a cluster and deploys Flink with Hadoop. To get started, just follow the steps below.
 
 * This will be replaced by the TOC
 {:toc}
@@ -35,13 +33,10 @@ steps below.
 
 ## Install Google Cloud SDK
 
-Please follow the instructions on how to setup the
-[Google Cloud SDK](https://cloud.google.com/sdk/). In particular, make sure to
-authenticate with Google Cloud using the following command:
+Please follow the instructions on how to set up the [Google Cloud SDK](https://cloud.google.com/sdk/). In particular, make sure to authenticate with Google Cloud using the following command:
 
     gcloud auth login
 
-
 ## Install bdutil
 
 At the moment, there is no bdutil release yet which includes the Flink
@@ -50,15 +45,13 @@ from [GitHub](https://github.com/GoogleCloudPlatform/bdutil):
 
     git clone https://github.com/GoogleCloudPlatform/bdutil.git
 
-After you have downloaded the source, change into the newly created `bdutil`
-directory and continue with the next steps.
+After you have downloaded the source, change into the newly created `bdutil` directory and continue with the next steps.
 
 # Deploying Flink on Google Compute Engine
 
 ## Set up a bucket
 
-If you have not done so, create a bucket for the bdutil config and
-staging files. A new bucket can be created with gsutil:
+If you have not done so, create a bucket for the bdutil config and staging files. A new bucket can be created with gsutil:
 
     gsutil mb gs://<bucket_name>
 
@@ -79,11 +72,7 @@ bdutil_env.sh.
 
 ## Adapt the Flink config
 
-bdutil's Flink extension handles the configuration for you. You may additionally
-adjust configuration variables in `extensions/flink/flink_env.sh`. If you want
-to make further configuration, please take a look at
-[configuring Flink](config.html). You will have to restart Flink after changing
-its configuration using `bin/stop-cluster` and `bin/start-cluster`.
+bdutil's Flink extension handles the configuration for you. You may additionally adjust configuration variables in `extensions/flink/flink_env.sh`. If you want to configure Flink further, please take a look at [configuring Flink](config.html). After changing the configuration, you will have to restart Flink using `bin/stop-cluster` and `bin/start-cluster`.
 
 ## Bring up a cluster with Flink
 
diff --git a/docs/setup/jobmanager_high_availability.md b/docs/setup/jobmanager_high_availability.md
index ca3bdb29ea705..3c054fdccd835 100644
--- a/docs/setup/jobmanager_high_availability.md
+++ b/docs/setup/jobmanager_high_availability.md
@@ -1,5 +1,8 @@
 ---
 title: "JobManager High Availability (HA)"
+top-nav-group: deployment
+top-nav-title: High Availability
+top-nav-pos: 5
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/setup/local_setup.md b/docs/setup/local_setup.md
index d2c53458e0613..54864f065c118 100644
--- a/docs/setup/local_setup.md
+++ b/docs/setup/local_setup.md
@@ -1,5 +1,8 @@
 ---
 title:  "Local Setup"
+top-nav-group: deployment
+top-nav-title: Local
+top-nav-pos: 1
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -27,7 +30,9 @@ This documentation is intended to provide instructions on how to run Flink local
 
 ## Download
 
-Go to the [downloads page]({{ site.download_url}}) and get the ready to run package. If you want to interact with Hadoop (e.g. HDFS or HBase), make sure to pick the Flink package **matching your Hadoop version**. When in doubt or you plan to just work with the local file system pick the package for Hadoop 1.2.x.
+Go to the [downloads page]({{ site.download_url }}) and get the ready-to-run package. If you want to interact with Hadoop (e.g. HDFS or HBase), make sure to pick the Flink package **matching your Hadoop version**. When in doubt, or if you plan to work only with the local file system, pick the package for Hadoop 1.2.x.
+
+{% top %}
 
 ## Requirements
 
@@ -42,17 +47,21 @@ java -version
 The command should output something comparable to the following:
 
 ~~~bash
-java version "1.6.0_22"
-Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
-Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)
+java version "1.8.0_51"
+Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
+Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
 ~~~
 
+{% top %}
+
 ## Configuration
 
 **For local mode Flink is ready to go out of the box and you don't need to change the default configuration.**
 
 The out of the box configuration will use your default Java installation. You can manually set the environment variable `JAVA_HOME` or the configuration key `env.java.home` in `conf/flink-conf.yaml` if you want to manually override the Java runtime to use. Consult the [configuration page](config.html) for further details about configuring Flink.
 
+{% top %}
+
 ## Starting Flink
 
 **You are now ready to start Flink.** Unpack the downloaded archive and change to the newly created `flink` directory. There you can start Flink in local mode:
@@ -77,6 +86,8 @@ INFO ... - Starting web info server for JobManager on port 8081
 
 The JobManager will also start a web frontend on port 8081, which you can check with your browser at `http://localhost:8081`.
 
+{% top %}
+
 ## Flink on Windows
 
 If you want to run Flink on Windows you need to download, unpack and configure the Flink archive as mentioned above. After that you can either use the **Windows Batch** file (`.bat`) or use **Cygwin**  to run the Flink Jobmanager.
@@ -97,6 +108,8 @@ Do not close this batch window. Stop job manager by pressing Ctrl+C.
 
 After that, you need to open a second terminal to run jobs using `flink.bat`.
 
+{% top %}
+
 ### Starting with Cygwin and Unix Scripts
 
 With *Cygwin* you need to start the Cygwin Terminal, navigate to your Flink directory and run the `start-local.sh` script:
@@ -107,6 +120,8 @@ $ bin/start-local.sh
 Starting Nephele job manager
 ~~~
 
+{% top %}
+
 ### Installing Flink from Git
 
 If you are installing Flink from the git repository and you are using the Windows git shell, Cygwin can produce a failure similiar to this one:
@@ -136,3 +151,5 @@ set -o igncr
 
 Save the file and open a new bash shell.
 
+{% top %}
+
diff --git a/docs/setup/yarn_setup.md b/docs/setup/yarn_setup.md
index a7309e448db76..d905d68d68d9f 100644
--- a/docs/setup/yarn_setup.md
+++ b/docs/setup/yarn_setup.md
@@ -1,5 +1,8 @@
 ---
 title:  "YARN Setup"
+top-nav-group: deployment
+top-nav-title: YARN
+top-nav-pos: 3
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -23,7 +26,9 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Quickstart: Start a long-running Flink cluster on YARN
+## Quickstart
+
+### Start a long-running Flink cluster on YARN
 
 Start a YARN session with 4 Task Managers (each with 4 GB of Heapspace):
 
@@ -40,7 +45,7 @@ Specify the `-s` flag for the number of processing slots per Task Manager. We re
 
 Once the session has been started, you can submit jobs to the cluster using the `./bin/flink` tool.
 
-## Quickstart: Run a Flink job on YARN
+### Run a Flink job on YARN
 
 ~~~bash
 # get the hadoop2 package from the Flink download page at
@@ -51,7 +56,7 @@ cd flink-{{ site.version }}/
 ./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096 ./examples/WordCount.jar
 ~~~
 
-## Apache Flink on Hadoop YARN using a YARN Session
+## Flink YARN Session
 
 Apache [Hadoop YARN](http://hadoop.apache.org/) is a cluster resource management framework. It allows to run various distributed applications on top of a cluster. Flink runs on YARN next to other applications. Users do not have to setup or install anything if there is already a YARN setup.
 
@@ -68,9 +73,9 @@ Follow these instructions to learn how to launch a Flink Session within your YAR
 
 A session will start all required Flink services (JobManager and TaskManagers) so that you can submit programs to the cluster. Note that you can run multiple programs per session.
 
-#### Download Flink for YARN
+#### Download Flink
 
-Download the YARN tgz package on the [download page]({{site.baseurl}}/downloads.html). It contains the required files.
+Download a Flink package for Hadoop >= 2 from the [download page]({{ site.download_url }}). It contains the required files.
 
 Extract the package using:
 
@@ -79,10 +84,6 @@ tar xvzf flink-{{ site.version }}-bin-hadoop2.tgz
 cd flink-{{site.version }}/
 ~~~
 
-If you want to build the YARN .tgz file from sources, follow the [build instructions](building.html).
-You can find the result of the build in `flink-dist/target/flink-{{ site.version }}-bin/flink-{{ site.version }}/` (*Note: The version might be different for you* ).
-
-
 #### Start a Session
 
 Use the following command to start a session
@@ -106,7 +107,6 @@ Usage:
      -qu,--queue <arg>               Specify YARN queue.
      -s,--slots <arg>                Number of slots per TaskManager
      -tm,--taskManagerMemory <arg>   Memory per TaskManager Container [in MB]
-
 ~~~
 
 Please note that the Client requires the `YARN_CONF_DIR` or `HADOOP_CONF_DIR` environment variable to be set to read the YARN and HDFS configuration.
@@ -129,7 +129,7 @@ Once Flink is deployed in your YARN cluster, it will show you the connection det
 
 Stop the YARN session by stopping the unix process (using CTRL+C) or by entering 'stop' into the client.
 
-#### Detached YARN session
+#### Detached YARN Session
 
 If you do not want to keep the Flink YARN client running all the time, its also possible to start a *detached* YARN session.
 The parameter for that is called `-d` or `--detached`.
@@ -139,7 +139,6 @@ Note that in this case its not possible to stop the YARN session using Flink.
 
 Use the YARN utilities (`yarn application -kill <appId`) to stop the YARN session.
 
-
 ### Submit Job to Flink
 
 Use the following command to submit a Flink program to the YARN cluster:
@@ -195,10 +194,9 @@ You can check the number of TaskManagers in the JobManager web interface. The ad
 If the TaskManagers do not show up after a minute, you should investigate the issue using the log files.
 
 
-## Run a single Flink job on Hadoop YARN
+## Run a single Flink job on YARN
 
-The documentation above describes how to start a Flink cluster within a Hadoop YARN environment.
-It is also possible to launch Flink within YARN only for executing a single job.
+The documentation above describes how to start a Flink cluster within a Hadoop YARN environment. It is also possible to launch Flink within YARN only for executing a single job.
 
 Please note that the client then expects the `-yn` value to be set (number of TaskManagers).
 
@@ -208,15 +206,11 @@ Please note that the client then expects the `-yn` value to be set (number of Ta
 ./bin/flink run -m yarn-cluster -yn 2 ./examples/WordCount.jar
 ~~~
 
-The command line options of the YARN session are also available with the `./bin/flink` tool.
-They are prefixed with a `y` or `yarn` (for the long argument options).
-
-Note: You can use a different configuration directory per job by setting the environment variable `FLINK_CONF_DIR`.
-To use this copy the `conf` directory from the Flink distribution and modify, for example, the logging settings on a per-job basis.
+The command line options of the YARN session are also available with the `./bin/flink` tool. They are prefixed with a `y` or `yarn` (for the long argument options).
 
-Note: It is possible to combine `-m yarn-cluster` with a detached YARN submission (`-yd`) to "fire and forget" a Flink job
-to the YARN cluster. In this case, your application will not get any accumulator results or exceptions from the ExecutionEnvironment.execute() call!
+Note: You can use a different configuration directory per job by setting the environment variable `FLINK_CONF_DIR`. To use this, copy the `conf` directory from the Flink distribution and modify, for example, the logging settings on a per-job basis.
 
+Note: It is possible to combine `-m yarn-cluster` with a detached YARN submission (`-yd`) to "fire and forget" a Flink job to the YARN cluster. In this case, your application will not get any accumulator results or exceptions from the `ExecutionEnvironment.execute()` call!
 
 ## Recovery behavior of Flink on YARN
 
@@ -226,7 +220,6 @@ Flink's YARN client has the following configuration parameters to control how to
 - `yarn.maximum-failed-containers`: The maximum number of failed containers the ApplicationMaster accepts until it fails the YARN session. Default: The number of initally requested TaskManagers (`-n`).
 - `yarn.application-attempts`: The number of ApplicationMaster (+ its TaskManager containers) attempts. If this value is set to 1 (default), the entire YARN session will fail when the Application master fails. Higher values specify the number of restarts of the ApplicationMaster by YARN.
 
-
 ## Debugging a failed YARN session
 
 There are many reasons why a Flink YARN session deployment can fail. A misconfigured Hadoop setup (HDFS permissions, YARN configuration), version incompatibilities (running Flink with vanilla Hadoop dependencies on Cloudera Hadoop) or other errors.
@@ -251,24 +244,22 @@ In addition to that, there is the YARN Resource Manager webinterface (by default
 
 It allows to access log files for running YARN applications and shows diagnostics for failed apps.
 
-
 ## Build YARN client for a specific Hadoop version
 
 Users using Hadoop distributions from companies like Hortonworks, Cloudera or MapR might have to build Flink against their specific versions of Hadoop (HDFS) and YARN. Please read the [build instructions](building.html) for more details.
 
-
 ## Running Flink on YARN behind Firewalls
 
 Some YARN clusters use firewalls for controlling the network traffic between the cluster and the rest of the network.
 In those setups, Flink jobs can only be submitted to a YARN session from within the cluster's network (behind the firewall).
-If this is not feasible for production use, Flink allows to configure a port range for all relevant services. With these 
+If this is not feasible for production use, Flink allows you to configure a port range for all relevant services. With these
 ranges configured, users can also submit jobs to Flink crossing the firewall.
 
 Currently, two services are needed to submit a job:
 
  * The JobManager (ApplicatonMaster in YARN)
  * The BlobServer running within the JobManager.
- 
+
 When submitting a job to Flink, the BlobServer will distribute the jars with the user code to all worker nodes (TaskManagers).
 The JobManager receives the job itself and triggers the execution.
 
@@ -300,4 +291,3 @@ The next step of the client is to request (step 2) a YARN container to start the
 The *JobManager* and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the *AM* container is also serving Flink's web interface. The ports Flink is using for its services are the standard ports configured by the user + the application id as an offset. This allows users to execute multiple Flink YARN sessions in parallel.
 
 After that, the AM starts allocating the containers for Flink's TaskManagers, which will download the jar file and the modified configuration from the HDFS. Once these steps are completed, Flink is set up and ready to accept Jobs.
-