
Add dask support #21

Merged: 18 commits merged into main from 8-dask-support on Mar 24, 2023
Conversation

@PythonFZ (Member) commented on Feb 9, 2023:

Open this PR: Binder

Install via pip install znflow[dask] to deploy the ZnFlow graph with Dask, either locally or on supported Dask workers.
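
A minimal sketch of what that looks like with a local Dask cluster (the LocalCluster parameters and the graph body are placeholders; the Deployment calls match the example further below):

from dask.distributed import Client, LocalCluster

import znflow

# Spin up a throwaway local cluster; Client("tcp://scheduler:8786") would
# connect to an existing Dask deployment instead.
cluster = LocalCluster(n_workers=4, threads_per_worker=1)
client = Client(cluster)

with znflow.DiGraph() as graph:
    ...  # build your znflow nodes here

deployment = znflow.deployment.Deployment(graph=graph, client=client)
deployment.submit_graph()
deployment.get_results(graph.nodes)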

@PythonFZ linked an issue on Feb 9, 2023 that may be closed by this pull request.
@PythonFZ (Member, Author) commented:

Check large graphs:

import dataclasses
import random
import time

import znflow
from dask.distributed import Client

@dataclasses.dataclass
class Node(znflow.Node):
    inputs: float
    outputs: float = None  # filled in by run()

    def run(self):
        time.sleep(0.1)  # simulate some work
        self.outputs = self.inputs * 2

@dataclasses.dataclass
class SumNodes(znflow.Node):
    inputs: list
    outputs: float = None  # filled in by run()

    def run(self):
        time.sleep(0.1)  # simulate some work
        self.outputs = sum(self.inputs)

k = 5
j = 5
i = 5

# Build a three-level reduction: k * j * i leaf Nodes, summed level by level.
with znflow.DiGraph() as graph:
    kdx_nodes = []
    for kdx in range(k):
        jdx_nodes = []
        for jdx in range(j):
            idx_nodes = []
            for idx in range(i):
                idx_nodes.append(Node(inputs=random.random()))
            jdx_nodes.append(SumNodes(inputs=[x.outputs for x in idx_nodes]))
        kdx_nodes.append(SumNodes(inputs=[x.outputs for x in jdx_nodes]))

    end_node = SumNodes(inputs=[x.outputs for x in kdx_nodes])

client = Client()  # local Dask client; pass a scheduler address to use an existing cluster
deployment = znflow.deployment.Deployment(graph=graph, client=client)
deployment.submit_graph()
deployment.get_results(graph.nodes)
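
With k = j = i = 5 this builds 125 leaf Nodes plus 25 + 5 + 1 SumNodes, i.e. 156 nodes in total.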

@SamTov (Member) left a comment:

This looks really nice. Just a general comment on Dask at large scale. The issue we came across was that Dask will deploy as many jobs as possible onto a single worker. So if you set your cluster parameters to request 5 GB of memory because that is how much one job requires, the limit is violated immediately: that memory is given to one worker, which then runs as many jobs as it can fit.

The way I am trying to avoid this is to assign worker resources that limit the number of tasks a worker can run, e.g. 1 GPU per model training means only one model can run on a worker at a time, or, for espresso, 1 "espresso". But in the case you have here, if your task was a large matrix computation and you heavily parallelised over nodes, I think you would hit a dead worker pretty fast.
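
For reference, a sketch of that worker-resources pattern in plain dask.distributed (the "GPU" resource name, the scheduler address, and train are illustrative placeholders, not part of this PR):

# Start workers that each advertise one abstract "GPU" resource, e.g.
#   dask worker tcp://scheduler:8786 --resources "GPU=1"
from dask.distributed import Client

client = Client("tcp://scheduler:8786")  # placeholder scheduler address

def train(seed):
    ...  # placeholder for a memory- or GPU-hungry task
    return seed

# Every task claims one "GPU", so at most one of these runs per worker at a time.
futures = [client.submit(train, s, resources={"GPU": 1}) for s in range(10)]
results = client.gather(futures)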

README.md: review comment (outdated, resolved)
@PythonFZ (Member, Author) commented:

Not a problem anymore. I can run 3600 nodes with Dask in ~20 s.

@PythonFZ mentioned this pull request on Mar 24, 2023
@PythonFZ merged commit 0169cd7 into main on Mar 24, 2023
@PythonFZ deleted the 8-dask-support branch on Mar 24, 2023 at 15:46
Labels: none
Projects: none
Linked issues that may be closed by merging: Dask support
Participants: 2