-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Provisioner] Check ports for cloud not specified cases #3139
Conversation
For discussion: How about adding OPEN_PORTS to requested features in |
sky/execution.py
Outdated
@@ -261,6 +261,9 @@ def _execute( | |||
# NOTE: in general we may not have sufficiently specified info | |||
# (cloud/resource) to check STOP_SPOT_INSTANCE here. This is checked in | |||
# the backend. | |||
if any(r.ports is not None for r in task.resources): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember if task.resources
being a set means any_of
is being used. (We should add a comment to Task.resources.) In that case, we can have any_of { ..requesting ports.., ..not requesting ports.. }
, and the logic here will effectively disallow the second resource.
Is the above right? Wdyt is the best way to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you are right... And it seems like we don't need the handle here since we will add resources-specific cloud required features and remove any related cloud for each resources in current master:
Lines 341 to 353 in 2f0432b
resources_required_features = resources.get_required_cloud_features() | |
if num_nodes > 1: | |
resources_required_features.add( | |
CloudImplementationFeatures.MULTI_NODE) | |
try: | |
self.check_features_are_supported(resources, | |
resources_required_features) | |
except exceptions.NotSupportedError: | |
# TODO(zhwu): The resources are now silently filtered out. We | |
# should have some logging telling the user why the resources | |
# are not considered. | |
return ([], []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually here we should only add task-level features like stop and multi-node
sky/resources.py
Outdated
else: | ||
at_least_one_cloud_supports_ports = False | ||
for cloud in global_user_state.get_enabled_clouds(): | ||
try: | ||
cloud.check_features_are_supported( | ||
self, {clouds.CloudImplementationFeatures.OPEN_PORTS}) | ||
at_least_one_cloud_supports_ports = True | ||
except exceptions.NotSupportedError: | ||
pass | ||
if not at_least_one_cloud_supports_ports: | ||
with ux_utils.print_exception_no_traceback(): | ||
raise ValueError('Cannot specify ports without enabling ' | ||
'a cloud that supports open ports.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic in execution.py should've caught this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for better UX. The current master is already disabling the resources with ports if only cloud that does not support ports are enabled; but it will print some message like in #3125:
I 02-08 22:42:32 optimizer.py:1219] No resource satisfying <Cloud>(ports=['8080']) on RunPod.
sky.exceptions.ResourcesUnavailableError: Catalog does not contain any instances satisfying the request:
Task(run='conda activate vllm\...')
resources: <Cloud>(ports=['8080']).
Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
As mentioned in #3125, previously SkyPilot would not print correct ports not supported message if the resources has no cloud specified, and all enabled cloud has no ports support. This PR fixes the problem.
Tested (run the relevant ones):
bash format.sh
sky check
; thensky serve up
the YAML described in Deployment Issue with serving Model on RunPod #3125pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh