Docs for the director-envoy 1.2 update #154

Merged
merged 54 commits into from Aug 31, 2021

igor-davidyuk
Contributor

No description provided.

@igor-davidyuk igor-davidyuk added the documentation Improvements or additions to documentation label Aug 17, 2021
@igor-davidyuk igor-davidyuk added this to the v1.2 milestone Aug 17, 2021
@igor-davidyuk igor-davidyuk self-assigned this Aug 17, 2021
@alexey-gruzdev alexey-gruzdev marked this pull request as ready for review August 19, 2021 07:34
@alexey-gruzdev
Contributor

ok to test

=========

|productName| comes with another long-lived actor called Envoy. It runs on collaborator machines connected to a *Director*.
There is a one-to-one mapping between *Envoys* and dataset shards: every *Envoy* needs exactly one *Shard Descriptor* to run.
Contributor

This makes the relationship between the Envoy and Datasets very clear. It might also be helpful to link to the ShardDescriptor here.

Contributor Author

I believe ShardDescriptor is explained in the director-based workflow section, so I can add a link to that section.
Alternatively, we could create another section under OpenFL structure that describes the interface entities for owners and managers of collaborator nodes.

Contributor Author

For now, I added a link to the ShardDescriptor base class.

=============

The Collaborator is also a short-lived entity. It manages training the model on local data: it executes assigned tasks, converts DL framework-specific tensor objects to the |productName| internal representation, and exchanges model parameters with the aggregator.
Tensor conversion is done by Framework Adapter plugins. |productName| ships with PyTorch and TensorFlow 2 framework adapters, and this list will be extended in the future. Users are free to implement their own adapter for a required DL framework, enabling |productName| support for experiments that use it. The adapter plugin interface is simple: there are two required methods, one to load tensors into a model and optimizer and one to extract them. The model is loaded with the relevant weights before every task, and at the end of a training task the weights are extracted and sent to the central node for aggregation.
Contributor

"|productName| ships with PyTorch and TensorFlow 2.x framework adapters. These framework adapters are intended to be extensible, and we encourage users to contribute new adapters for DL frameworks they would like to see supported in |productName|."

Contributor

Is the Framework Adapter documented elsewhere? The content here is good, but we should add a dedicated section to the documentation that walks through the plugin interface.

Contributor Author

Accepted proposed changes and added a dedicated section for plugin components.
https://openfl.readthedocs.io/en/docs-director-envoy/source/openfl/plugins.html

@itrushkin
Contributor

Please merge the latest changes from the develop branch so that the new CI tests pass.


|productName| comes with another long-lived actor called *Envoy*. It runs on collaborator machines connected to a *Director*.
There is a one-to-one mapping between *Envoys* and dataset shards: every *Envoy* needs exactly one
`Shard Descriptor <https://github.com/intel/openfl/blob/develop/openfl/interface/interactive_api/shard_descriptor.py>`_ to run.
Contributor

👍

===========

The *Aggregator* is a short-lived entity, which means that its lifespan is limited by the experiment execution time.
It orchestrates *Collaborators* according to the FL plan and aggregates model updates.
Contributor

"...performs model aggregation at the end of each round".

Contributor Author

accepted
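
For readers of this thread, the "model aggregation at the end of each round" step can be illustrated with a small standalone sketch. It assumes sample-count-weighted averaging, which is one common aggregation rule; the thread does not say which rule the Aggregator applies. The function name, the flat-list tensor format, and the example data are all hypothetical and are not part of OpenFL.

```python
def weighted_average(updates):
    """Average collaborator tensors, weighted by local sample counts.

    `updates` is a list of (tensor_dict, num_samples) pairs, where each
    tensor_dict maps a tensor name to a flat list of floats.
    This is an illustrative helper, not an OpenFL function.
    """
    total = sum(num_samples for _, num_samples in updates)
    if total <= 0:
        raise ValueError("at least one sample is required")
    first_tensors = updates[0][0]
    return {
        name: [
            sum(tensors[name][i] * n for tensors, n in updates) / total
            for i in range(len(first_tensors[name]))
        ]
        for name in first_tensors
    }


# Two hypothetical collaborators report updates at the end of a round.
round_updates = [
    ({"w": [1.0, 2.0]}, 1),  # trained on 1 sample
    ({"w": [3.0, 6.0]}, 3),  # trained on 3 samples
]
aggregated = weighted_average(round_updates)
print(aggregated)  # {'w': [2.5, 5.0]}
```

The collaborator with more samples pulls the average toward its own weights, which is the usual motivation for weighting by sample count.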

@staticmethod
def set_tensor_dict(model, tensor_dict, optimizer=None, device='cpu') -> None:

Implement the optional :code:`serialization_setup` method if some preparation is required before model serialization.
Contributor

"If your new framework model cannot be directly serialized with pickle-type libraries, you can optionally implement the :code:`serialization_setup` method to prepare the model object for serialization."

Contributor Author

replaced, thank you
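
As a rough illustration of the adapter interface discussed in this thread, here is a minimal standalone sketch. Only the `set_tensor_dict` signature and the optional `serialization_setup` hook come from the diff above. The `get_tensor_dict` name, the `ToyModel` class, and the list-based tensor format are assumptions made for this example, and the code does not import OpenFL.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a DL framework model with named weights."""

    def __init__(self):
        self.weights = {
            "dense/kernel": [[0.0, 0.0], [0.0, 0.0]],
            "dense/bias": [0.0, 0.0],
        }


class ToyFrameworkAdapter:
    """Illustrative adapter with one method to extract and one to load tensors."""

    @staticmethod
    def serialization_setup():
        # Optional hook: nothing to prepare for this plain-Python model.
        pass

    @staticmethod
    def get_tensor_dict(model, optimizer=None) -> dict:
        # Extract a framework-neutral copy of the weights (assumed method name).
        return copy.deepcopy(model.weights)

    @staticmethod
    def set_tensor_dict(model, tensor_dict, optimizer=None, device='cpu') -> None:
        # Load the given tensors into the model; `device` is unused in this toy.
        for name, value in tensor_dict.items():
            model.weights[name] = copy.deepcopy(value)


model = ToyModel()
ToyFrameworkAdapter.set_tensor_dict(model, {"dense/bias": [1.0, 1.0]})
extracted = ToyFrameworkAdapter.get_tensor_dict(model)
print(extracted["dense/bias"])  # [1.0, 1.0]
```

Copying on both load and extract keeps the model's weights independent of the dictionaries exchanged with the central node, so later changes on either side do not leak into the other.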


If you have a trusted workspace and the connection does not need to be encrypted, you can use the :code:`disable_tls` option when starting an experiment.
Otherwise, every node participating in the federation must be certified. Certificates enable mutual TLS connections between nodes.
You can certify nodes with your own PKI or use the PKI provided by OpenFL, which is based on `step-ca <https://github.com/smallstep/certificates>`_.
Contributor

PKI should be capitalized everywhere except shell commands (i.e. fx pki ...)

Contributor Author

fixed
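
To make the choice between disabling TLS and using mutual TLS concrete, here is a generic sketch built on Python's standard `ssl` module. It is not how OpenFL sets up its connections. The function name and the `cert`, `key`, and `ca` parameters are hypothetical, and `disable_tls` only mirrors the option name from the documentation text above.

```python
import ssl


def make_server_context(disable_tls=False, cert=None, key=None, ca=None):
    """Return None for a plaintext connection, else a mutual-TLS server context.

    Illustrative helper only; the file paths are supplied by the caller.
    """
    if disable_tls:
        # Trusted environment: the caller falls back to an unencrypted channel.
        return None
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Requiring a client certificate is what makes the TLS handshake mutual.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert and key:
        # This node's own certificate and private key, issued by the PKI.
        context.load_cert_chain(certfile=cert, keyfile=key)
    if ca:
        # Root certificate used to verify certificates presented by peers.
        context.load_verify_locations(cafile=ca)
    return context


plain = make_server_context(disable_tls=True)
secured = make_server_context()
print(plain is None, secured.verify_mode == ssl.CERT_REQUIRED)  # True True
```

In a real deployment, each node would pass the certificate, key, and root CA files issued by whichever PKI certified it.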

@alexey-gruzdev alexey-gruzdev merged commit 7f94704 into develop Aug 31, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Aug 31, 2021
@alexey-gruzdev alexey-gruzdev deleted the docs-director-envoy branch August 31, 2021 12:36
@igor-davidyuk igor-davidyuk restored the docs-director-envoy branch August 31, 2021 15:07
@alexey-gruzdev alexey-gruzdev deleted the docs-director-envoy branch September 2, 2021 14:17