
Task Assigner initial implementation #343

Merged

Conversation


@aleksandr-mokrov aleksandr-mokrov commented Feb 18, 2022

Example of usage in Jupyter.

Getting the registered tasks:

task_keeper = task_interface
tasks = task_keeper.get_registered_tasks()
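
For reference, tasks is assumed here to be a dict keyed by task name; the assigners below look up entries by these names:

print(list(tasks.keys()))
# with the task names used throughout this example:
# ['train', 'locally_tuned_model_validate', 'aggregated_model_validate']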

Creating a random assigner:

def random_assigner(collaborators, round_number, **kwargs):
    """Assign task groups randomly while keeping the target distribution."""
    import random
    random.shuffle(collaborators)
    collaborator_task_map = {}
    for idx, col in enumerate(collaborators):
        # The first 70% of the shuffled collaborators get all tasks
        # (training and validation); the remaining 30% only validate.
        if (idx + 1) / len(collaborators) <= 0.7:
            collaborator_task_map[col] = list(tasks.values())
        else:
            collaborator_task_map[col] = [tasks['aggregated_model_validate']]
    return collaborator_task_map
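
A quick sanity check (the collaborator names are purely illustrative):

collaborator_task_map = random_assigner(['env_one', 'env_two', 'env_three'], round_number=0)
# With three collaborators, the first two of the shuffled list receive the
# full task set and the third receives only aggregated_model_validate.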

Or creating a filtering assigner:

def filter_assigner(collaborators, round_number, **kwargs):
    collaborator_task_map = {}
    exclude_collaborators = ['env_two', 'env_three']
    for collaborator in collaborators:
        if collaborator in exclude_collaborators:
            continue
        collaborator_task_map[collaborator] = [
            tasks['train'], 
            tasks['locally_tuned_model_validate'],
            tasks['aggregated_model_validate']
        ]
    return collaborator_task_map
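
Collaborators filtered out this way get no entry in the map at all, so they receive no tasks for that round.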

Or creating a filtering assigner based on GPU info:

shard_registry = federation.get_shard_registry()

def filter_by_shard_registry_assigner(collaborators, round_number, **kwargs):
    collaborator_task_map = {}
    for collaborator in collaborators:
        col_status = shard_registry.get(collaborator)
        if not col_status or not col_status['is_online']:
            continue
        node_info = col_status['shard_info'].node_info
        # Assign the train task if the collaborator has a GPU with more than 8 GB of total memory
        if len(node_info.cuda_devices) > 0 and node_info.cuda_devices[0].memory_total > 8 * 1024**3:
            collaborator_task_map[collaborator] = [
                tasks['train'], 
                tasks['locally_tuned_model_validate'],
                tasks['aggregated_model_validate'],
            ]
        else:
            collaborator_task_map[collaborator] = [
                tasks['aggregated_model_validate'],
            ]
    return collaborator_task_map
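
The shard_registry entries are assumed to have roughly this shape (field names taken from the snippet above, values illustrative):

# shard_registry.get('env_one') ->
# {
#     'is_online': True,
#     'shard_info': <shard info whose .node_info.cuda_devices is a list of
#                    device descriptors, each exposing .memory_total in bytes>,
# }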

Or creating an assigner with an additional validation round (the extra final round only validates the last aggregated model):

rounds_to_train = 3
total_rounds = rounds_to_train + 1 # fl_experiment.start(..., rounds_to_train=total_rounds,...)

def assigner_with_last_round_validation(collaborators, round_number, **kwargs):
    collaborator_task_map = {}
    for collaborator in collaborators:
        if round_number == total_rounds - 1:
            collaborator_task_map[collaborator] = [
                tasks['aggregated_model_validate'],
            ]
        else:
            collaborator_task_map[collaborator] = [
                tasks['train'], 
                tasks['locally_tuned_model_validate'],
                tasks['aggregated_model_validate']
            ]
    return collaborator_task_map
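
With the settings above (rounds_to_train = 3, total_rounds = 4) and round_number counting from 0, the resulting schedule is:

# rounds 0-2 -> train + locally_tuned_model_validate + aggregated_model_validate
# round 3    -> aggregated_model_validate only (the extra validation round)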

And then pass the assigner into the experiment start call via the task_assigner argument (for the last-round-validation example, rounds_to_train should be total_rounds):

fl_experiment.start(
    model_provider=model_interface, 
    task_keeper=task_interface,
    data_loader=fed_dataset,
    task_assigner=assigner,
    rounds_to_train=1,
    opt_treatment='CONTINUE_GLOBAL',
    device_assignment_policy='CUDA_PREFERRED'
)

Future opportunities with this assigner function interface:
Filtering by collaborator status, with shard_registry delivered through **kwargs rather than captured from the notebook scope:

def filter_by_shard_registry_assigner(collaborators, round_number, **kwargs):
    shard_registry = kwargs.get('shard_registry')
    collaborator_task_map = {}
    for collaborator in collaborators:
        col_status = shard_registry.get(collaborator)
        if not col_status or not col_status['is_online']:
            continue
        collaborator_task_map[collaborator] = [
            tasks['train'], 
            tasks['locally_tuned_model_validate'],
            tasks['aggregated_model_validate']
        ]
    return collaborator_task_map


# There is not enough information to tell whether this is a train or a validate task.
# It looks like we should provide the task type.
kwargs = {}
Contributor

Removing the kwargs resolution from the plan will break the task runner API. Maybe add an if statement for this logic:

    if hasattr(self.task_runner, 'TASK_REGISTRY'):
        func_name = task.function_name

        # There is not enough information to tell whether this is a train or a validate task.
        # It looks like we should provide the task type.
        kwargs = {}
        if func_name == 'validate':
            if task.is_local:
                kwargs['apply'] = 'local'
            else:
                kwargs['apply'] = 'global'
    else:
        func_name = self.task_config[task]['function']
        kwargs = self.task_config[task]['kwargs']
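
In this suggested branch, runners that expose a TASK_REGISTRY (presumably the interactive-API task runner) get their validate kwargs reconstructed on the fly, while plan-defined runners keep reading the function name and kwargs from task_config.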

Contributor Author

Changed

@alexey-gruzdev alexey-gruzdev linked an issue Feb 24, 2022 that may be closed by this pull request
@alexey-gruzdev alexey-gruzdev added the enhancement New feature or request label Feb 24, 2022
@aleksandr-mokrov aleksandr-mokrov marked this pull request as ready for review February 24, 2022 21:46
@@ -854,6 +857,7 @@ def _end_of_round_check(self):
        if self._time_to_quit():
            self.logger.info('Experiment Completed. Cleaning up...')
        else:
            self.round_number += 1
Contributor
@psfoley psfoley Feb 25, 2022

Moving the round increment to the else statement will result in N+1 rounds being executed. This should be changed back.

Contributor Author

No, in fact the difference is only in logging. The real time-to-quit check happens in the get_tasks function. I reverted it, but we should rewrite this part of the code in the future.

@alexey-gruzdev alexey-gruzdev changed the title New assigner sketch Task Assigner initial implementation Feb 28, 2022
@alexey-gruzdev alexey-gruzdev added this to the v1.3 milestone Feb 28, 2022
@alexey-gruzdev alexey-gruzdev merged commit 9dcf0d4 into securefederatedai:develop Feb 28, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Feb 28, 2022
Successfully merging this pull request may close these issues: Task Assigner Entity