Docs: add a hint to customizing spot controller. #2753
Conversation
LGTM! Left one suggestion 🫡
disk_size: 100

The :code:`resources` field has the same spec as a normal SkyPilot job; see `here <https://skypilot.readthedocs.io/en/latest/reference/yaml-spec.html>`__.

.. note::
  These settings will not take effect if you have an existing controller (either
Shall we add a note saying that changing the config without terminating the spot controller could lead to a failure?
Related: #2703 (for context of this PR, I'm waiting for Zhanghao's opinion)
Probably should add such a warning as part of #2703, since right now it doesn't error out. Wdyt?
Now it still errors out due to sky.exceptions.ResourcesMismatchError, though arguably the error message is informative, so it should be fine to leave it to #2703 ; )
sky spot launch whoami
Task from command: whoami
Managed spot job 'sky-cmd' will be launched on (estimated):
I 11-05 10:56:23 optimizer.py:674] == Optimizer ==
I 11-05 10:56:23 optimizer.py:686] Target: minimizing cost
I 11-05 10:56:23 optimizer.py:697] Estimated cost: $0.0 / hour
I 11-05 10:56:23 optimizer.py:697]
I 11-05 10:56:23 optimizer.py:770] Considered resources (1 node):
I 11-05 10:56:23 optimizer.py:818] ----------------------------------------------------------------------------------------------------------------
I 11-05 10:56:23 optimizer.py:818] CLOUD INSTANCE vCPUs Mem(GB) ACCELERATORS REGION/ZONE COST ($) CHOSEN
I 11-05 10:56:23 optimizer.py:818] ----------------------------------------------------------------------------------------------------------------
I 11-05 10:56:23 optimizer.py:818] GCP n2-standard-8[Spot] 8 32 - northamerica-northeast2-a 0.04 ✔
I 11-05 10:56:23 optimizer.py:818] AWS m6i.2xlarge[Spot] 8 32 - eu-north-1c 0.12
I 11-05 10:56:23 optimizer.py:818] ----------------------------------------------------------------------------------------------------------------
I 11-05 10:56:23 optimizer.py:818]
Launching the spot job 'sky-cmd'. Proceed? [Y/n]:
Launching managed spot job 'sky-cmd' from spot controller...
Launching spot controller...
sky.exceptions.ResourcesMismatchError: Requested resources do not match the existing cluster.
Requested: 1x AWS(cpus=4, disk_size=50)
Existing: 1x GCP(n2-standard-4, disk_size=50)
To fix: specify a new cluster name, or down the existing cluster first: sky down sky-spot-controller-402b1bba
Yep, let's leave it there. I think the situation where the same cloud is used but a smaller controller is requested is more common.
3. Changing the disk_size of the spot controller to store more logs. (Default: 50GB)
1. Use a lower-cost controller (if you have a low number of concurrent spot jobs).
2. Enforcing the spot controller to run on a specific location. (Default: cheapest location)
3. Changing the maximum number of spot jobs that can be run concurrently, which is 2x the vCPUs of the controller. (Default: 16)
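The customizations listed above map onto the controller's :code:`resources` section in the SkyPilot config. A minimal sketch of what ``~/.sky/config.yaml`` could look like, assuming the field names from the SkyPilot YAML spec (the specific values below are illustrative, not recommendations):

```yaml
# ~/.sky/config.yaml -- illustrative values only
spot:
  controller:
    resources:            # same spec as a normal SkyPilot job
      cloud: gcp          # pin the controller to a specific cloud
      region: us-central1 # ...and a specific region
      cpus: 4+            # fewer vCPUs -> lower cost, but fewer concurrent jobs
      disk_size: 100      # larger disk to store more logs (default: 50 GB)
```

Per the discussion above, these settings only take effect on a fresh controller; with an existing controller you would need to tear it down first (``sky down <controller-name>``).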
An unrelated question: where is this logic implemented? I briefly went through the source code but didn't find any logic that makes a job pending when there are too many jobs.
Oh nvm.. Just noticed that it is because our default job takes 0.5 CPUs.
I only have one service that doesn't need autoscaling, what would be the lowest controller config possible? I am not able to get anything lower than m6i.xlarge even though I changed the configuration and made sure no controller was live. Thanks
Hi @cyril94440 ! Thanks for your interest in SkyServe. Could you try the following config?

```yaml
serve:
  controller:
    resources:
      cpus: 2
```
I tried that without any success, still an m6i.large unfortunately.

Hey @cyril94440, you may have to

Thank you very much. It is working with the "serve:" directive. Is this config the lowest possible? Or can we use a t3 or t4g instance as a controller? Thank you

We would recommend having at least 2GB of memory with Intel or AMD CPUs for a controller that serves only a single service, so t3.small might be good.
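If you want to pin the controller to that exact instance type rather than letting the optimizer pick one, something like the following sketch could work, assuming :code:`instance_type` and :code:`cloud` are accepted in the controller's :code:`resources` block as in a normal SkyPilot job spec (unverified here; t3.small is AWS's 2 vCPU / 2 GB burstable type):

```yaml
# ~/.sky/config.yaml -- hypothetical single-service setup
serve:
  controller:
    resources:
      cloud: aws
      instance_type: t3.small  # 2 vCPUs, 2 GB memory
```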
This has come up quite a few times, so adding a hint.
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh