Various documentation improvements
awicenec committed Sep 15, 2021
1 parent 4a1992e commit 667268e
Showing 28 changed files with 214 additions and 153 deletions.
3 changes: 1 addition & 2 deletions daliuge-engine/dlg/manager/web/VERSION
Original file line number Diff line number Diff line change
@@ -1,2 +1 @@
1.0.0:master
7f80e3c41e4d1201e4f776dcd66ae897ad37be8b
1.0.0
2 changes: 2 additions & 0 deletions daliuge-engine/setup.py
@@ -171,6 +171,7 @@ def run(self):
"web/static/fonts/*",
"web/static/js/*.js",
"web/static/js/d3/*",
"web/static/icons/*",
],
"dlg.dropmake": [
"web/lg_editor.html",
@@ -182,6 +183,7 @@
"web/pg_viewer.html",
"web/matrix_vis.html",
"lib/libmetis.*",
"web/static/icons/*",
],
"test.dropmake": ["logical_graphs/*.json"],
"test.apps": ["dynlib_example.c", "dynlib_example2.c"],
12 changes: 6 additions & 6 deletions docs/architecture/dataflow.rst
@@ -29,13 +29,13 @@ formalism to describe parallel computation, early efforts in developing
had to introduce control flow operators (e.g. switch and merge) and data
storage mechanisms in order to put dataflow models into practice.

.. _dataflow.datadriven:
.. _dataflow.data-activated:

Data-driven
^^^^^^^^^^^
Data-activated
^^^^^^^^^^^^^^
In developing |daliuge|, we have extended the "traditional" dataflow
model by integrating data lifecycle management, graph execution engine, and
cost-optimal resource allocation into a coherent data-driven framework.
cost-optimal resource allocation into a coherent *data-activated* framework.
Concretely, we have made the following changes to the existing dataflow model:

* Unlike traditional dataflow models that characterise data as "tokens" moving
@@ -52,7 +52,7 @@ Concretely, we have made the following changes to the existing dataflow model:
after restart, etc., but also enables data sharing amongst multiple processing
pipelines in situations like re-processing or commensal observations.
All the state information is kept in the Drop wrapper, while the payload of the
Drops, i.e. pipeline component algorithms and data, are stateless.
Drops, i.e. pipeline component algorithms and data, remain stateless.

* We introduced a small number of control flow graph nodes at the logical level
such as *Scatter*, *Gather*, *GroupBy*, *Loop*, etc. These additional control
@@ -62,7 +62,7 @@ Concretely, we have made the following changes to the existing dataflow model:
ordinary Drops at the physical level. Thus they are nearly transparent to the
underlying graph/dataflow execution engine, which focuses solely on exploring
parallelisms orthogonal to these control nodes placed by applications. In this
way, the Data-Driven framework enjoys the best from both worlds - expressivity
way, the data-activated framework enjoys the best from both worlds - expressivity
at the application level and flexibility at the dataflow system level.

* Finally, we differentiate between two kinds of dataflow graphs - **Logical Graph** and
2 changes: 1 addition & 1 deletion docs/architecture/dlm.rst
@@ -1,7 +1,7 @@
Data Lifecycle Manager
----------------------

As mentioned in :ref:`intro` and :ref:`dataflow.datadriven` |daliuge| also integrates
As mentioned in :ref:`intro` and :ref:`dataflow.data-activated` |daliuge| also integrates
a data lifecycle management within the data processing framework. Its purpose is
to make sure the data is dealt with correctly in terms of storage, taking into
account how and when it is used. This includes, for instance, placing medium-
4 changes: 2 additions & 2 deletions docs/architecture/drops.rst
@@ -41,7 +41,7 @@ all events sent by all Drops and make use of them.
Relationships
^^^^^^^^^^^^^

Drops are connected between them and create a graph representing an execution
Drops are connected and create a dependency graph representing an execution
plan, where inputs and outputs are connected to applications, establishing the
following possible relationships:

@@ -79,7 +79,7 @@ responsibility of the application to ensure that the I/O is occurring in the
correct location and using the expected format for storage or subsequent
upstream processing by other application Drops.

|daliuge| provides various commonly used data Drops with their associated I/O
|daliuge| provides various commonly used :ref:`data components <data_index>` with their associated I/O
storage classes, including in-memory, file-based and S3 storage.

.. _drop.channels:
90 changes: 0 additions & 90 deletions docs/development/app_development.rst

This file was deleted.

32 changes: 32 additions & 0 deletions docs/development/app_development/I_O.rst
@@ -0,0 +1,32 @@
I/O
===

An application's input and output drops
are accessed through its
:class:`inputs <dlg.drop.AppDROP.inputs>` and
:attr:`outputs <dlg.drop.AppDROP.outputs>` members.
Both of these are lists of :class:`drops <dlg.drop.AbstractDROP>`,
and will be sorted in the same order
in which inputs and outputs
were defined in the Logical Graph.
Each element can also be queried
for its :attr:`uid <dlg.drop.AbstractDROP.uid>`.

Data can be read from input drops,
and written to output drops.
To read data from an input drop,
one first calls the drop's
:attr:`open <dlg.drop.AbstractDROP.open>` method,
which returns a descriptor to the opened drop.
Using this descriptor one can perform successive calls to
:attr:`read <dlg.drop.AbstractDROP.read>`,
which will return the data stored in the drop.
Finally, the drop's
:attr:`close <dlg.drop.AbstractDROP.close>` method
should be called
to ensure that all internal resources are freed.

Writing data into an output drop is similar but simpler.
Application authors need only call the
:attr:`write <dlg.drop.AbstractDROP.write>` method
one or more times
with the data that needs to be written.
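The read/write pattern above can be sketched in a few lines. To keep the sketch self-contained and runnable, a toy ``MemoryDrop`` class stands in for a real :class:`dlg.drop.AbstractDROP`: only the open/read/close/write protocol described in this section is mimicked, and the ``run`` helper below is a hypothetical component body, not a |daliuge| API.

```python
import io

class MemoryDrop:
    """Toy stand-in for a data drop, holding its payload in memory."""
    def __init__(self, uid, data=b""):
        self.uid = uid
        self._buf = io.BytesIO(data)

    def open(self):
        self._buf.seek(0)
        return self._buf          # descriptor to the opened drop

    def read(self, descriptor, count=4096):
        return descriptor.read(count)  # successive calls return the stored data

    def close(self, descriptor):
        pass                      # a real drop frees internal resources here

    def write(self, data):
        self._buf.write(data)

def run(inputs, outputs):
    """Hypothetical component body: read all inputs, write to all outputs."""
    data = b""
    for inp in inputs:
        desc = inp.open()
        chunk = inp.read(desc)
        while chunk:              # read until the drop is exhausted
            data += chunk
            chunk = inp.read(desc)
        inp.close(desc)           # free internal resources
    for out in outputs:
        out.write(data.upper())   # trivial "processing" step

a = MemoryDrop("A", b"hello")
b = MemoryDrop("B")
run([a], [b])
print(b.open().read())            # → b'HELLO'
```

In a real component the framework hands you the ``inputs`` and ``outputs`` lists; the open/read/close/write calls follow the protocol above.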
32 changes: 32 additions & 0 deletions docs/development/app_development/app_index.rst
@@ -0,0 +1,32 @@
* :ref:`genindex`
* :ref:`search`

.. _app_index:

|daliuge| Application Component Developers Guide
################################################

This chapter describes what developers need to do
to write a new application component that can be used
as an Application Drop during the execution of a |daliuge| graph.

Detailed instructions can be found in the respective sections for
each type of component. There are also separate sections describing
integration and testing during component development.

*NOTE: The DCDG is a work in progress!*

.. toctree::
:maxdepth: 2

bash_components
python_components
python_function_components
dynlib_components
docker_components
service_components
I_O
wrap_existing
test_and_debug
eagle_integration
deployment_testing
6 changes: 6 additions & 0 deletions docs/development/app_development/bash_advanced.rst
@@ -0,0 +1,6 @@
.. .. _advanced_bash:
..
.. Advanced Bash Components
.. ------------------------
TODO
64 changes: 64 additions & 0 deletions docs/development/app_development/bash_components.rst
@@ -0,0 +1,64 @@
.. _bash_components:

Bash Components
===============
These are probably the easiest components to implement; for simple ones it is possible to do all the 'development' directly in EAGLE.

'Hello World' in Bash through EAGLE
-----------------------------------
Steps:

* Open EAGLE (e.g. https://eagle.icrar.org) and create a new graph (Graph --> New --> Create New Graph)
* Drag and drop a 'Bash Shell App' from the 'All Nodes' palette on the right hand panel onto the canvas.
* Click on the 'Bash Shell App' title bar in the Inspector tab on the right hand panel. This will open all additional settings below.
* First change the 'Name' field of the app in the 'Display Options' menu. Call it 'Hello World'. Once you leave the entry field, the black title bar will also reflect the new name.
* Now change the description of the app in the 'Description' menu. Maybe you write 'Simple Hello World bash app'.
* Now go down to the 'Component Parameters' menu and enter the bash command in the 'Command' field::

    echo "Hello World"

* Now save your new toy graph (Graph --> Local Storage --> Save Graph).

That should give you an idea of how to use bash commands as |daliuge| components. Seems like not a lot? Actually, this allows you to execute, as part of a |daliuge| graph, whatever can be executed on the command line where the engine is running. That includes all bash commands, but also every other executable available on the PATH of the engine. That is a bit more exciting, but the excitement stops as soon as you think about real world (not Hello World) examples: really useful commands require inputs and outputs in the form of command line parameters and files or pipes. This is discussed in the :ref:`advanced_bash` chapter.

Verification
------------

Do we believe that this is actually working? Well, probably not. So let's translate and execute this graph. Note that the graph has neither an input nor an output defined, so there is not much you can expect from running it. However, the |daliuge| engine is pretty verbose when run in debug mode, and we will use that to investigate what is happening. The following steps are very helpful when it comes to debugging actual components.

Assuming you have a translator and an engine running, you can translate and execute this admittedly pretty useless graph. If you have the engine running locally in development mode, you can even see the output in the session log file::

cd /tmp/dlg/logs
ls -ltra dlg_*

The output of the ls command looks like::

-rw-r--r-- 1 root root 1656 Sep 14 16:46 dlg_172.17.0.3_Diagram-2021-09-14-16-41-283_2021-09-14T08-46-17.341082.log
-rw-r--r-- 1 root root 6991 Sep 14 16:46 dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-46-52.618798.log
-rw-r--r-- 1 root root 6991 Sep 14 16:47 dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-47-28.890072.log

There could be a lot more lines on top, but the important one is the last line, which is the log file of the session last executed on the engine. Just dump its content to the terminal::

cat dlg_172.17.0.3_Diagram-2021-09-14-16-41-284_2021-09-14T08-47-28.890072.log

Since the engine is running in debugging mode there will be many lines in this file, but towards the end you will find something like::

2021-09-14 08:47:28,912 [ INFO] [ Thread-62] [2021-09-14] dlg.apps.bash_shell_app#_run_bash:217 Finished in 0.006 [s] with exit code 0
2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.apps.bash_shell_app#_run_bash:220 Command finished successfully, output follows:
==STDOUT==
Hello World

2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.manager.node_manager#handleEvent:65 AppDrop uid=2021-09-14T08:46:48_-1_0, oid=2021-09-14T08:46:48_-1_0 changed to execState 2
2021-09-14 08:47:28,912 [DEBUG] [ Thread-62] [2021-09-14] dlg.manager.node_manager#handleEvent:63 Drop uid=2021-09-14T08:46:48_-1_0, oid=2021-09-14T08:46:48_-1_0 changed to state 2

In addition to the session log file, the same information is also contained in the dlgNM.log file in the same directory. That file contains all logs produced by the node manager for all sessions and more, which is usually pretty distracting. However, the names of the session logs are not known before you deploy a session, so another trick is to monitor dlgNM.log using the tail command::

tail -f dlgNM.log

When you now deploy the graph again and watch the terminal output, you will see a lot of messages pass through.
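To jump straight to the command output in a session log, grep for the ``==STDOUT==`` marker shown above in the newest log file. The snippet below demonstrates the pattern on a generated sample file, since the real logs under ``/tmp/dlg/logs`` only exist on a running engine; the directory and file name here are illustrative stand-ins.

```shell
# Demonstrate extracting the ==STDOUT== block from the newest session log.
# A sample log is generated here; on a live engine, point logdir at /tmp/dlg/logs.
logdir=$(mktemp -d)
printf '... many debug lines ...\n==STDOUT==\nHello World\n' > "$logdir/dlg_sample.log"
# pick the most recently modified session log
newest=$(ls -t "$logdir"/dlg_*.log | head -n 1)
# print the marker line plus the line after it (the captured stdout)
grep -A 1 '==STDOUT==' "$newest"
```

On an engine this prints the ``==STDOUT==`` line followed by the command's output, without scrolling through the full debug log.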

.. _advanced_bash:

Advanced Bash Components
------------------------
.. include:: bash_advanced.rst
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Deployment Testing
==================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Docker Components
=================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Dynlib Components
=================

TODO
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
.. _eagle_integration:

Automatic EAGLE Component Description Generation
------------------------------------------------
Automatic EAGLE Palette Generation
----------------------------------

In order to support the direct usage of newly written application components in the EAGLE editor, the |daliuge| system includes a custom set of Doxygen directives and tools. When writing an application component, developers can add specific custom
In order to support the direct usage of newly written application components in the EAGLE editor, the |daliuge| system supports a custom set of Doxygen directives and tools. When writing an application component, developers can add specific custom
`Doxygen <https://www.doxygen.nl/>`_ comments to the source code.
These comments describe the application and can
be used to automatically generate a DALiuGE component so that the
application can be used in the *EAGLE* Logical Graph Editor.
be used to automatically generate a JSON DALiuGE component description
which can be used in the *EAGLE* Logical Graph Editor.

The comments should be contained within an *EAGLE_START* and *EAGLE_END*
pair.
@@ -103,5 +103,5 @@ a continuous integration step can then use the tools provided by the |daliuge| system
The processing will:

* combine the Doxygen output XML into a single XML file
* transform the XML into an EAGLE palette file
* push the palette file to the *ICRAR/EAGLE_test_repo* repository.
* transform the XML into an EAGLE palette file (JSON)
* push the palette file to a GitHub/GitLab repository (optional).
15 changes: 15 additions & 0 deletions docs/development/app_development/python_components.rst
@@ -0,0 +1,15 @@
.. default-domain:: py

.. _python_components:

Python Components
=================

Developers need to write a new Python class
that derives from the :class:`dlg.drop.BarrierAppDROP` class.
This base class defines all methods and attributes
that derived classes need to function correctly.
This new class needs a single method
called :attr:`run <dlg.drop.InputFiredAppDROP.run>`,
which receives no arguments
and executes the logic of the application.
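The structure described above can be sketched as follows. So that the example runs without |daliuge| installed, the ``BarrierAppDROP`` defined here is a trivial stand-in for the real :class:`dlg.drop.BarrierAppDROP`; ``HelloWorldApp`` and its list-based outputs are purely illustrative, and in a real component the engine populates the inputs/outputs and invokes ``run`` for you.

```python
class BarrierAppDROP:
    """Trivial stand-in for dlg.drop.BarrierAppDROP (illustration only)."""
    def __init__(self):
        self.inputs = []    # populated by the framework from the graph
        self.outputs = []

    def execute(self):
        # the engine triggers this once all inputs have been produced
        self.run()

class HelloWorldApp(BarrierAppDROP):
    """Writes a greeting into every output."""
    def run(self):
        # run() takes no arguments: all I/O goes through inputs/outputs
        for out in self.outputs:
            out.extend(b"Hello World")

app = HelloWorldApp()
app.outputs.append(bytearray())   # stand-in for an output drop
app.execute()
print(bytes(app.outputs[0]))      # → b'Hello World'
```

The only part an application author writes is the subclass with its ``run`` method; everything else is framework machinery.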
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Python Function Components
==========================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Service Components
==================

TODO
Original file line number Diff line number Diff line change
@@ -2,3 +2,5 @@

Test And Debug
==============

TODO