Liu 367 #235

awicenec · 2023-07-05T12:49:27Z

Thus PR covers the full application of named ports for application components, rather than just Pyfunc components. It is now possible to connect any component with a port having the same name as one of the application arguments or component parameters and the engine will connect the value with the input or output. The component will still need to read or write. The best example is the updated version of dlg.apps.simple.SleepApp. There is now also a test related to that, based on a new example graph. In addition also the HelloWorld_Universe graph has been updated to run without the addition of the example components and has also been added to the tests, because there was no test for the gather component at all and the gather component broke with the first version of this update. We now have a few simple functions in dlg.apps.simple_functions, which are mostly used for testing, but are also useful to show how things are being done when using functions.
An InputFiredAppDROP will now first run it's internal _run() method and then execute the run() method provided by the App component implementation. The checking and population of the named ports is in that _run() method and thus applies to any component derived from the BarrierAppDROP(InputFiredAppDROP).

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

Calling random.randint(1e6) is deprecated, as 1e6 is a float, not an integer. Likewise, threading.Event.isSet is deprecated in favour of is_set, which we were using in most places anyway. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

There is no need to re-compile these regular expressions each time we create a drop, so this should save us some memory and CPU. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

This object is a simple dataclass that holds the information that DropProxy objects need to perform their duties. This is the *static* information -- the DropProxy still requires a RPCClient to actually interact with a remote RPC server. This little utility class encapsulates some of the behavior that was previously found in the dynlib module, where we setup proxies for newly-created processes. While this isn't a great change in itself, it prepares the codebase for a much bigger change: the introduction of subprocesses for the effective execution of drop apps. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

Drops stored a reference to the session that contained them in the _dlg_session attribute. This attribute was then later used to extract the ID of such session via *its* sessionID attribute. The existence of this ID is then seen in a number of places across the code, causing a number of effects (setting environment variables, logging the session ID, etc). A closer inspection to the code revealed that the sessionID attribute was the only attribute ever read from the drop._dlg_session attribute. Thus, storing the full session object is unnecessary. While during normal usage this doesn't matter much, there is a negative effect on serialisation of drops, which cannot be achieved because the Session object bound to one of its attributes isn't serialisable (it holds not only a lock, but also a reference to the Node Manager, which contains open file descriptors, thread pools, and more). This commit removes the internal _dlg_session attribute from the AbstractDROP class, and replaces it with a _dlg_session_id. To make things easier overall we default its value to an empty string, both when the drops are created directly (e.g., MyDrop()) and when they are constructed from the graph_loader module. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

Similarly to how each drop doesn't need full access to its session, but only needs to know about the session ID, full access to the RPC server hosting the drop isn't necessary either, only its endpoint. The latter is used to create drop proxies on newly spawned processes running app drops so they can contact their inputs/outputs. This is currently needed by the dynlib module, which does such spawning, but we want to move to a multiprocessing world where most (if not all) app drops execute in separate processes. This commit removes the _rpc_server attribute injected by the Session object into each drop, and referencing the full NodeManager, and replaces it with a simpler _rpc_endpoint that simply contains the (host,port) tuple needed to contact the RPC server. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

… missing dropclass param. Minor changes.

By extracting this into an ABC we can change the implementation (by default it will occur synchronously, but in the context of a NodeManager, it will execute on a ThreadPool). This sets the stage for an implementation that utilises Processes for true parrallelisation. If a drop is asynchronously executed a seperate, daemon thread is created to wait for execution to finish. This job shouldn't be run on the same pool as the DropRunner as it is easy to deadlock by running out of executors in the implementing pool (e.g. a ThreadPool with max_executors=1, async_execute() is submitted, takes up the single Thread prior to the actual run() method being submitted to the pool. In this case run() will never execute as there is never a free thread and async_execute() blocks forever). Note that the daemon thread will not be terminated until the process exits, so this causes a memory leak and will need to be addressed in future.

So each drop will run on seperate processes for true parallelism. Mirror the ThreadDropRunner tests to ensure the functionality is the same across both implementations.

As we can't rely on an external service for unit tests. Update the NGAS host to one that will be maintained for manual tests when required.

Looks like a later version of pydantic is stricter about whether values are required. A `Union[int, None]` without an initialised value of `None` is no longer valid. To fix this, we simply initialise with `None`.

rtobar and others added 30 commits May 23, 2023 13:04

Remove unnecessary file

770862b

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

Avoid deprecated python features

41c5b4c

Calling random.randint(1e6) is deprecated, as 1e6 is a float, not an integer. Likewise, threading.Event.isSet is deprecated in favour of is_set, which we were using in most places anyway. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

Make RE objects static members of AbstractDROP

986e024

There is no need to re-compile these regular expressions each time we create a drop, so this should save us some memory and CPU. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>

updated default for numpy load to allow pickle

201ca2a

Fixed bug reading input port in sleepApp

69a1289

Added named ports to APP components

7b02b40

Fixed failing test

a85de26

Added missing import

f00747d

Don't read from ports, but leave that to the app

635c154

Changed doxygen for PyFunc

5bb47ff

A few more fixes for docker deployments

76bd920

Fixed test for new dropclass

4fee9e6

Combine dummy input and output ports into one InputOutput port. Add a…

1e3c241

… missing dropclass param. Minor changes.

Attempt to use dlg_paletteGen instead of xml2palette.py

f14808a

Minor cleanup

5ffea37

Update dixygen to newest format

ff5e86d

Add dummy input port to Branch component

96730b5

Re-named the tag and digest params on the Docker drop

8ac9b41

Combine dummy input and output ports into one InputOutput port. Add a…

909fc05

… missing dropclass param. Minor changes.

Attempt to use dlg_paletteGen instead of xml2palette.py

e4a200b

Update dixygen to newest format

4fb6805

Add dummy input port to Branch component

8a0079e

Re-named the tag and digest params on the Docker drop

4424765

Combine dummy input and output ports into one InputOutput port. Add a…

217bba8

… missing dropclass param. Minor changes.

Attempt to use dlg_paletteGen instead of xml2palette.py

ee55380

Update dixygen to newest format

5d2d4b3

Add dummy input port to Branch component

9221a88

awicenec and others added 29 commits November 7, 2023 15:37

Adjusted for OOD

6f2a955

Implement ProcessDropRunner

a0d746e

So each drop will run on seperate processes for true parallelism. Mirror the ThreadDropRunner tests to ensure the functionality is the same across both implementations.

Skip NGAS tests

2094405

As we can't rely on an external service for unit tests. Update the NGAS host to one that will be maintained for manual tests when required.

Fix lgweb errors

8b97854

Looks like a later version of pydantic is stricter about whether values are required. A `Union[int, None]` without an initialised value of `None` is no longer valid. To fix this, we simply initialise with `None`.

Fixed pydantic, string2json docstring

3bdf3bd

Fixed pydata and tests

21ba3e8

Fixed parset drop and added formats

170bfac

Updated paramset description

f493b76

Added benedict to docs build

537540c

Fixed test

585ff99

Small documentation change

44bf360

Fixed parset tests

d1240d4

Merge branch 'multiproc-drop-apps-julian' into liu-367

014d2cc

Chasing test errors

f9b6ac8

Still chasing

54db33d

Finally tracked down the problem. Mereg complete!

ff883fa

Added parser params to docstrings

3a86977

dropclass update for InMemory drop

d6a45c4

Bumped minor version

991c8db

daliuge 3.1.0

9d8d612

updated worflows

8567cd3

Timeout for delayed requests

fa39900

Fixed dlm test

d0b974d

Fixed versions of zeroconf and zmq

f0a8b82

PYTHONPATH fix for tests

3899bb5

Fixed typo in workflow

590e0d8

More path fixes

dd8219d

Cleanup

ba77770

awicenec merged commit bc75eef into master Mar 18, 2024
13 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Liu 367 #235

Liu 367 #235

awicenec commented Jul 5, 2023 •

edited

Loading

Liu 367 #235

Liu 367 #235

Conversation

awicenec commented Jul 5, 2023 • edited Loading

awicenec commented Jul 5, 2023 •

edited

Loading