Infrastructure for GPU devices #162
0b34e32 to 1d5035f
    resp.health_check_period.seconds = health_check_period

    return resp

async def GetEnvoys(self, request, context):  # NOQA:N802
    """Get a status information about envoys."""
    envoy_infos = self.director.get_envoys()

    return director_pb2.GetEnvoysResponse(envoy_infos=envoy_infos)
    response = []
This isn't a response. It's envoy_infos or envoy_info_messages. The response is director_pb2.GetEnvoysResponse.
changed to envoy_statuses
openfl/interface/envoy.py (outdated)
          root_certificate, private_key, certificate):
    """Start the Envoy."""
    logger.info('🧿 Starting the Envoy.')

    shard_descriptor = shard_descriptor_from_config(shard_config_path)
    # Reed the Envoy config
Read
And it looks like we can add a method read_envoy_config
I would argue that we need a separate builder component that would assemble our services and manage their plugins. For now, I would just leave the config-reading logic in the CLI.
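For reference, a minimal sketch of what a read_envoy_config helper could look like if it were split out later. Everything here is an assumption made for illustration: the EnvoyConfig container, the section names (shard_descriptor, device_monitor, params), and the choice to take an already-parsed mapping instead of a file path are not taken from this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass
class EnvoyConfig:
    """Hypothetical container for the sections of an Envoy config."""

    shard_descriptor: dict = field(default_factory=dict)
    # None means the optional device monitor plugin is not configured.
    device_monitor: Optional[dict] = None
    params: dict = field(default_factory=dict)


def read_envoy_config(raw: Mapping[str, Any]) -> EnvoyConfig:
    """Split an already-parsed config mapping into sections.

    The key names are assumptions, not the project's actual schema.
    """
    if 'shard_descriptor' not in raw:
        raise ValueError('Envoy config must define a shard_descriptor section')

    monitor = raw.get('device_monitor')
    return EnvoyConfig(
        shard_descriptor=dict(raw['shard_descriptor']),
        device_monitor=dict(monitor) if monitor else None,
        params=dict(raw.get('params') or {}),
    )
```

Taking a parsed mapping keeps file I/O in the command-line layer, so the same function could later be called from a builder component without change.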
ok to test
added a message to director.proto with gpu info
Co-authored-by: Ilya Trushkin <ilya.trushkin@intel.com>
fb290d9 to b5702e0
ok to test
An attempt to allow assigning GPU devices through OpenFL.
The PR introduces an optional 'device monitor' plugin for Envoy and two information flows:
The Director_Pytorch_Kvasir_UNET example is modified to utilize the new infrastructure.
There are 2 envoys, one that utilizes GPUs and one that does not.
Device assignment for an experiment is done through a 'device assignment policy', which may be 'cuda preferred' or 'cpu only'.
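The two policies could be modelled roughly as follows. This is an illustrative sketch, not the code in this PR: the DeviceAssignmentPolicy enum, the assign_devices function, and the 'cuda:N' / 'cpu' device strings are assumed names, and the list of CUDA device indices stands in for whatever the device monitor plugin reports.

```python
from enum import Enum
from typing import List, Sequence


class DeviceAssignmentPolicy(Enum):
    """Hypothetical enum mirroring the two policies named in the description."""

    CPU_ONLY = 'cpu only'
    CUDA_PREFERRED = 'cuda preferred'


def assign_devices(policy: DeviceAssignmentPolicy,
                   cuda_device_indices: Sequence[int]) -> List[str]:
    """Pick device names for an experiment under the given policy.

    cuda_device_indices is assumed to come from a device monitor;
    an empty sequence means the envoy has no usable GPUs.
    """
    if policy is DeviceAssignmentPolicy.CUDA_PREFERRED and cuda_device_indices:
        return [f'cuda:{index}' for index in cuda_device_indices]
    # 'cpu only', or 'cuda preferred' on an envoy without GPUs.
    return ['cpu']
```

Under these assumptions, an envoy without GPUs still accepts a 'cuda preferred' experiment and runs it on the CPU, which matches the two-envoy setup described above.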