Various documentation improvements
awicenec committed Sep 15, 2021
1 parent 4a1992e commit 667268e
Showing 28 changed files with 214 additions and 153 deletions.
3 changes: 1 addition & 2 deletions daliuge-engine/dlg/manager/web/VERSION
Original file line number Diff line number Diff line change
@@ -1,2 +1 @@
1.0.0:master
7f80e3c41e4d1201e4f776dcd66ae897ad37be8b
1.0.0
2 changes: 2 additions & 0 deletions daliuge-engine/setup.py
@@ -171,6 +171,7 @@ def run(self):
"web/static/fonts/*",
"web/static/js/*.js",
"web/static/js/d3/*",
"web/static/icons/*",
],
"dlg.dropmake": [
"web/lg_editor.html",
@@ -182,6 +183,7 @@
"web/pg_viewer.html",
"web/matrix_vis.html",
"lib/libmetis.*",
"web/static/icons/*",
],
"test.dropmake": ["logical_graphs/*.json"],
"test.apps": ["dynlib_example.c", "dynlib_example2.c"],
12 changes: 6 additions & 6 deletions docs/architecture/dataflow.rst
@@ -29,13 +29,13 @@ formalism to describe parallel computation, early efforts in developing
had to introduce control flow operators (e.g. switch and merge) and data
storage mechanisms in order to put dataflow models into practice.

.. _dataflow.datadriven:
.. _dataflow.data-activated:

Data-driven
^^^^^^^^^^^
Data-activated
^^^^^^^^^^^^^^
In developing |daliuge|, we have extended the "traditional" dataflow
model by integrating data lifecycle management, graph execution engine, and
cost-optimal resource allocation into a coherent data-driven framework.
cost-optimal resource allocation into a coherent *data-activated* framework.
Concretely, we have made the following changes to the existing dataflow model:

* Unlike traditional dataflow models that characterise data as "tokens" moving
@@ -52,7 +52,7 @@ Concretely, we have made the following changes to the existing dataflow model:
after restart, etc., but also enables data sharing amongst multiple processing
pipelines in situations like re-processing or commensal observations.
All the state information is kept in the Drop wrapper, while the payload of the
Drops, i.e. pipeline component algorithms and data, are stateless.
Drops, i.e. pipeline component algorithms and data, remain stateless.

* We introduced a small number of control flow graph nodes at the logical level
such as *Scatter*, *Gather*, *GroupBy*, *Loop*, etc. These additional control
@@ -62,7 +62,7 @@ Concretely, we have made the following changes to the existing dataflow model:
ordinary Drops at the physical level. Thus they are nearly transparent to the
underlying graph/dataflow execution engine, which focuses solely on exploring
parallelisms orthogonal to these control nodes placed by applications. In this
way, the Data-Driven framework enjoys the best from both worlds - expressivity
way, the data-activated framework enjoys the best from both worlds - expressivity
at the application level and flexibility at the dataflow system level.

* Finally, we differentiate between two kinds of dataflow graphs - **Logical Graph** and
2 changes: 1 addition & 1 deletion docs/architecture/dlm.rst
@@ -1,7 +1,7 @@
Data Lifecycle Manager
----------------------

As mentioned in :ref:`intro` and :ref:`dataflow.datadriven` |daliuge| also integrates
As mentioned in :ref:`intro` and :ref:`dataflow.data-activated` |daliuge| also integrates
a data lifecycle management within the data processing framework. Its purpose is
to make sure the data is dealt with correctly in terms of storage, taking into
account how and when it is used. This includes, for instance, placing medium-
4 changes: 2 additions & 2 deletions docs/architecture/drops.rst
@@ -41,7 +41,7 @@ all events sent by all Drops and make use of them.
Relationships
^^^^^^^^^^^^^

Drops are connected between them and create a graph representing an execution
Drops are connected and create a dependency graph representing an execution
plan, where inputs and outputs are connected to applications, establishing the
following possible relationships:

@@ -79,7 +79,7 @@ responsibility of the application to ensure that the I/O is occurring in the
correct location and using the expected format for storage or subsequent
upstream processing by other application Drops.

|daliuge| provides various commonly used data Drops with their associated I/O
|daliuge| provides various commonly used :ref:`data components <data_index>` with their associated I/O
storage classes, including in-memory, file-based and S3 storage.

.. _drop.channels:
90 changes: 0 additions & 90 deletions docs/development/app_development.rst

This file was deleted.

32 changes: 32 additions & 0 deletions docs/development/app_development/I_O.rst
@@ -0,0 +1,32 @@
I/O
===

An application's input and output drops
are accessed through its
:class:`inputs <dlg.drop.AppDROP.inputs>` and
:attr:`outputs <dlg.drop.AppDROP.outputs>` members.
Both of these are lists of :class:`drops <dlg.drop.AbstractDROP>`,
and will be sorted in the same order
in which inputs and outputs
were defined in the Logical Graph.
Each element can also be queried
for its :attr:`uid <dlg.drop.AbstractDROP.uid>`.

Data can be read from input drops,
and written to output drops.
To read data from an input drop,
one first calls the drop's
:attr:`open <dlg.drop.AbstractDROP.open>` method,
which returns a descriptor to the opened drop.
Using this descriptor one can perform successive calls to
:attr:`read <dlg.drop.AbstractDROP.read>`,
which will return the data stored in the drop.
Finally, the drop's
:attr:`close <dlg.drop.AbstractDROP.close>` method
should be called
to ensure that all internal resources are freed.

Writing data into an output drop is similar but simpler.
Application authors need only call the
:attr:`write <dlg.drop.AbstractDROP.write>` method
one or more times
with the data that needs to be written.
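The read/write pattern above can be sketched in a few lines. To keep the sketch self-contained and runnable, a toy ``MemoryDrop`` class stands in for a real :class:`dlg.drop.AbstractDROP`: only the open/read/close/write protocol described in this section is mimicked, and the ``run`` helper below is a hypothetical component body, not a |daliuge| API.

```python
import io

class MemoryDrop:
    """Toy stand-in for a data drop, holding its payload in memory."""
    def __init__(self, uid, data=b""):
        self.uid = uid
        self._buf = io.BytesIO(data)

    def open(self):
        self._buf.seek(0)
        return self._buf          # descriptor to the opened drop

    def read(self, descriptor, count=4096):
        return descriptor.read(count)  # successive calls return the stored data

    def close(self, descriptor):
        pass                      # a real drop frees internal resources here

    def write(self, data):
        self._buf.write(data)

def run(inputs, outputs):
    """Hypothetical component body: read all inputs, write to all outputs."""
    data = b""
    for inp in inputs:
        desc = inp.open()
        chunk = inp.read(desc)
        while chunk:              # read until the drop is exhausted
            data += chunk
            chunk = inp.read(desc)
        inp.close(desc)           # free internal resources
    for out in outputs:
        out.write(data.upper())   # trivial "processing" step

a = MemoryDrop("A", b"hello")
b = MemoryDrop("B")
run([a], [b])
print(b.open().read())            # → b'HELLO'
```

In a real component the framework hands you the ``inputs`` and ``outputs`` lists; the open/read/close/write calls follow the protocol above.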
32 changes: 32 additions & 0 deletions docs/development/app_development/app_index.rst
@@ -0,0 +1,32 @@
* :ref:`genindex`
* :ref:`search`

.. _app_index:

|daliuge| Application Component Developers Guide
################################################

This chapter describes what developers need to do
to write a new application component that can be used
as an Application Drop during the execution of a |daliuge| graph.

Detailed instructions can be found in the respective sections for
each type of component. There are also separate sections describing
integration and testing during component development.

*NOTE: The DCDG is a work in progress!*

.. toctree::
:maxdepth: 2

bash_components
python_components
python_function_components
dynlib_components
docker_components
service_components
I_O
wrap_existing
test_and_debug
eagle_integration
deployment_testing
6 changes: 6 additions & 0 deletions docs/development/app_development/bash_advanced.rst
@@ -0,0 +1,6 @@
.. .. _advanced_bash:
..
.. Advanced Bash Components
.. ------------------------
TODO
64 changes: 64 additions & 0 deletions docs/development/app_development/bash_components.rst
@@ -0,0 +1,64 @@
.. _bash_components:

Bash Components
===============
These are probably the easiest components to implement; for simple ones it is possible to do all the 'development' directly in EAGLE.

'Hello World' in Bash through EAGLE
-----------------------------------
Steps:

* Open EAGLE (e.g. https://eagle.icrar.org) and create a new graph (Graph --> New --> Create New Graph)
* Drag and drop a 'Bash Shell App' from the 'All Nodes' palette on the right hand panel onto the canvas.
* Click on the 'Bash Shell App' title bar in the Inspector tab on the right hand panel. This will open all additional settings below.
* First change the 'Name' field of the app in the 'Display Options' menu. Call it 'Hello World'. Once you leave the entry field, the black title bar will also reflect the new name.
* Now change the description of the app in the 'Description' menu. Maybe you write 'Simple Hello World bash app'.
* Now go down to the 'Component Parameters' menu and enter the bash command in the 'Command' field::

    echo "Hello World"

* Now save your new toy graph (Graph --> Local Storage --> Save Graph).

That should give you an idea of how to use bash commands as |daliuge| components. Seems like not a lot? Actually, this allows you to execute, as part of a |daliuge| graph, whatever can be executed on the command line where the engine is running. That includes all bash commands, but also every other executable available on the PATH of the engine. That is a bit more exciting, but the excitement stops as soon as you think about real world (not Hello World) examples: really useful commands require inputs and outputs in the form of command line parameters and files or pipes. This is discussed in the :ref:`advanced_bash` chapter.

Verification
------------

Do we believe that this is actually working? Well, probably not. So let's translate and execute this graph. Note that the graph has neither an input nor an output defined, so there is not much you can expect from running it. However, the |daliuge| engine is pretty verbose when run in debug mode, and we will use that to investigate what is happening. The following steps are very helpful when it comes to debugging actual components.

Assuming you have a translator and an engine running, you can translate and execute this admittedly pretty useless graph. If you have the engine running locally in development mode, you can even see the output in the session log file::

cd /tmp/dlg/logs
ls -ltra dlg_*

The output of the ls command looks like::

-rw-r--r-- 1 root root 1656 Sep 14 16:46 dlg_172.17.0.3_Diagram-2021-09-14-16-41-283_2021-09-14T08-46-17.341082.log
-rw-r--r-- 1 root root 6991 Sep 14 16:46 dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-46-52.618798.log
-rw-r--r-- 1 root root 6991 Sep 14 16:47 dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-47-28.890072.log

There could be a lot more lines on top, but the important one is the last line, which is the log file of the session last executed on the engine. Just dump its content to the terminal::

cat dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-47-28.890072.log

Since the engine is running in debugging mode there will be many lines in this file, but towards the end you will find something like::

2021-09-14 08:47:28,912 [ INFO] [ Thread-62] [2021-09-14] dlg.apps.bash_shell_app#_run_bash:217 Finished in 0.006 [s] with exit code 0
2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.apps.bash_shell_app#_run_bash:220 Command finished successfully, output follows:
==STDOUT==
Hello World

2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.manager.node_manager#handleEvent:65 AppDrop uid=2021-09-14T08:46:48_-1_0, oid=2021-09-14T08:46:48_-1_0 changed to execState 2
2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.manager.node_manager#handleEvent:63 Drop uid=2021-09-14T08:46:48_-1_0, oid=2021-09-14T08:46:48_-1_0 changed to state 2

In addition to the session log file, the same information is also contained in the dlgNM.log file in the same directory. That file contains all logs produced by the node manager for all sessions and more, which is usually pretty distracting. However, the names of the session logs are not known before you deploy a session, so another trick is to monitor dlgNM.log using the tail command::

tail -f dlgNM.log

When you now deploy the graph again and watch the terminal output, you will see a lot of messages pass through.
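To jump straight to the command output in a session log, grep for the ``==STDOUT==`` marker shown above in the newest log file. The snippet below demonstrates the pattern on a generated sample file, since the real logs under ``/tmp/dlg/logs`` only exist on a running engine; the directory and file name here are illustrative stand-ins.

```shell
# Demonstrate extracting the ==STDOUT== block from the newest session log.
# A sample log is generated here; on a live engine, point logdir at /tmp/dlg/logs.
logdir=$(mktemp -d)
printf '... many debug lines ...\n==STDOUT==\nHello World\n' > "$logdir/dlg_sample.log"
# pick the most recently modified session log
newest=$(ls -t "$logdir"/dlg_*.log | head -n 1)
# print the marker line plus the line after it (the captured stdout)
grep -A 1 '==STDOUT==' "$newest"
```

On an engine this prints the ``==STDOUT==`` line followed by the command's output, without scrolling through the full debug log.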

.. _advanced_bash:

Advanced Bash Components
------------------------
.. include:: bash_advanced.rst
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Deployment Testing
==================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Docker Components
=================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Dynlib Components
=================

TODO
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
.. _eagle_integration:

Automatic EAGLE Component Description Generation
------------------------------------------------
Automatic EAGLE Palette Generation
----------------------------------

In order to support the direct usage of newly written application components in the EAGLE editor, the |daliuge| system includes a custom set of Doxygen directives and tools. When writing an application component, developers can add specific custom
In order to support the direct usage of newly written application components in the EAGLE editor, the |daliuge| system supports a custom set of Doxygen directives and tools. When writing an application component, developers can add specific custom
`Doxygen <https://www.doxygen.nl/>`_ comments to the source code.
These comments describe the application and can
be used to automatically generate a DALiuGE component so that the
application can be used in the *EAGLE* Logical Graph Editor.
be used to automatically generate a JSON DALiuGE component description
which can be used in the *EAGLE* Logical Graph Editor.

The comments should be contained within an *EAGLE_START* and *EAGLE_END*
pair.
@@ -103,5 +103,5 @@ a continuous integration step can then use the tools provided by the |daliuge| system
The processing will:

* combine the Doxygen output XML into a single XML file
* transform the XML into an EAGLE palette file
* push the palette file to the *ICRAR/EAGLE_test_repo* repository.
* transform the XML into an EAGLE palette file (JSON)
* push the palette file to a GitHub/GitLab repository (optional).
15 changes: 15 additions & 0 deletions docs/development/app_development/python_components.rst
@@ -0,0 +1,15 @@
.. default-domain:: py

.. _python_components:

Python Components
=================

Developers need to write a new Python class
that derives from the :class:`dlg.drop.BarrierAppDROP` class.
This base class defines all methods and attributes
that derived classes need to function correctly.
This new class needs a single method
called :attr:`run <dlg.drop.InputFiredAppDROP.run>`,
which receives no arguments
and executes the logic of the application.
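The structure described above can be sketched as follows. So that the example runs without |daliuge| installed, the ``BarrierAppDROP`` defined here is a trivial stand-in for the real :class:`dlg.drop.BarrierAppDROP`; ``HelloWorldApp`` and its list-based outputs are purely illustrative, and in a real component the engine populates the inputs/outputs and invokes ``run`` for you.

```python
class BarrierAppDROP:
    """Trivial stand-in for dlg.drop.BarrierAppDROP (illustration only)."""
    def __init__(self):
        self.inputs = []    # populated by the framework from the graph
        self.outputs = []

    def execute(self):
        # the engine triggers this once all inputs have been produced
        self.run()

class HelloWorldApp(BarrierAppDROP):
    """Writes a greeting into every output."""
    def run(self):
        # run() takes no arguments: all I/O goes through inputs/outputs
        for out in self.outputs:
            out.extend(b"Hello World")

app = HelloWorldApp()
app.outputs.append(bytearray())   # stand-in for an output drop
app.execute()
print(bytes(app.outputs[0]))      # → b'Hello World'
```

The only part an application author writes is the subclass with its ``run`` method; everything else is framework machinery.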
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Python Function Components
==========================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Service Components
==================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Test And Debug
==============

TODO