Add Airflow screenshots
erssebaggala committed Jan 3, 2018
1 parent 8253687 commit 1a8ecea
Showing 10 changed files with 25 additions and 4 deletions.
Binary file modified _build/doctrees/environment.pickle
Binary file not shown.
Binary file modified _build/doctrees/mediation.doctree
Binary file not shown.
Binary file added _build/html/_images/dag_execution_time.png
11 changes: 10 additions & 1 deletion _build/html/_sources/mediation.rst.txt
@@ -4,5 +4,14 @@ One of the biggest challenges we have had to address is how to handle the many c
necessary to load data into the database, transform it, and perform other domain specific processing.

Rather than reinvent the wheel, we carefully surveyed the available open source data pipeline and ETL tools. We settled on Apache's `Airflow <http://airflow.apache.org>`_ project,
started at Airbnb. What it does is pure magic!
started at Airbnb. What it does is pure magic! Below is a sample of the Ericsson 3g4g ETL process defined as a DAG (Directed Acyclic Graph) in Airflow. Each dependency is clearly defined and easy to track.

.. image:: sample_etl_for_ericsson_3g4g.png
   :alt: Sample ETL data pipeline for Ericsson 3g and 4g configuration management data

The next figure shows the duration of the entire process, with the time each sub-task took displayed in a Gantt chart. Identifying which process is the bottleneck becomes a trivial task.

.. image:: dag_execution_time.png
   :alt: Gantt chart of sub-task durations for the Ericsson 3g and 4g ETL data pipeline


5 changes: 4 additions & 1 deletion _build/html/mediation.html
@@ -47,7 +47,10 @@ <h1>Mediation<a class="headerlink" href="#mediation" title="Permalink to this he
<p>One of the biggest challenges we have had to address is how to handle the many complex, dependent data pipelines/workflows
necessary to load data into the database, transform it, and perform other domain specific processing.</p>
<p>Rather than reinvent the wheel, we carefully surveyed the available open source data pipeline and ETL tools. We settled on Apache’s <a class="reference external" href="http://airflow.apache.org">Airflow</a> project,
started at Airbnb. What it does is pure magic!</p>
started at Airbnb. What it does is pure magic! Below is a sample of the Ericsson 3g4g ETL process defined as a DAG (Directed Acyclic Graph) in Airflow. Each dependency is clearly defined and easy to track.</p>
<img alt="Sample ETL data pipeline for Ericsson 3g and 4g configuration management data" src="_images/sample_etl_for_ericsson_3g4g.png" />
<p>The next figure shows the duration of the entire process, with the time each sub-task took displayed in a Gantt chart. Identifying which process is the bottleneck becomes a trivial task.</p>
<img alt="Gantt chart of sub-task durations for the Ericsson 3g and 4g ETL data pipeline" src="_images/dag_execution_time.png" />
</div>


2 changes: 1 addition & 1 deletion _build/html/searchindex.js


Binary file added dag_execution_time.png
11 changes: 10 additions & 1 deletion mediation.rst
@@ -4,5 +4,14 @@ One of the biggest challenges we have had to address is how to handle the many c
necessary to load data into the database, transform it, and perform other domain specific processing.

Rather than reinvent the wheel, we carefully surveyed the available open source data pipeline and ETL tools. We settled on Apache's `Airflow <http://airflow.apache.org>`_ project,
started at Airbnb. What it does is pure magic!
started at Airbnb. What it does is pure magic! Below is a sample of the Ericsson 3g4g ETL process defined as a DAG (Directed Acyclic Graph) in Airflow. Each dependency is clearly defined and easy to track.

.. image:: sample_etl_for_ericsson_3g4g.png
   :alt: Sample ETL data pipeline for Ericsson 3g and 4g configuration management data

The next figure shows the duration of the entire process, with the time each sub-task took displayed in a Gantt chart. Identifying which process is the bottleneck becomes a trivial task.

.. image:: dag_execution_time.png
   :alt: Gantt chart of sub-task durations for the Ericsson 3g and 4g ETL data pipeline
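
To make the idea of a DAG and its bottleneck concrete, here is a minimal sketch in plain Python. It does not use the Airflow API; it relies only on the standard library's ``graphlib`` module (Python 3.9+). The task names and durations are hypothetical and are not taken from the actual Ericsson pipeline.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps a task to the set of tasks it depends on.
dependencies = {
    "parse_raw_files": set(),
    "load_to_staging": {"parse_raw_files"},
    "transform_3g": {"load_to_staging"},
    "transform_4g": {"load_to_staging"},
    "publish": {"transform_3g", "transform_4g"},
}

# Hypothetical per-task durations in seconds, as a Gantt chart would show them.
durations = {
    "parse_raw_files": 40,
    "load_to_staging": 120,
    "transform_3g": 30,
    "transform_4g": 35,
    "publish": 5,
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def bottleneck(task_durations):
    """Return the name of the task with the longest duration."""
    return max(task_durations, key=task_durations.get)


order = execution_order(dependencies)
slowest = bottleneck(durations)
print("Run order:", order)
print("Bottleneck:", slowest)
```

Because the dependencies are declared explicitly, a valid run order can be derived mechanically, and a cyclic dependency would raise ``graphlib.CycleError`` instead of being scheduled silently.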


Binary file added sample_etl_for_ericsson_3g4g.png
