# Tour of Sky
Sky is a tool to run any workload seamlessly across different cloud providers through a unified interface. No knowledge of cloud offerings is required or expected - you simply define the workload and its resource requirements, and Sky will automatically execute it on AWS, Google Cloud Platform or Microsoft Azure.

### Key Features
- **Run your code on the cloud with zero code changes**
- **Easy provisioning of VMs** across multiple cloud platforms (AWS, Azure or GCP)
- **Fast and iterative development** with quick access to cloud instances for prototyping. If the cloud is unavailable, a local development mode is also available
- **Store your datasets on the cloud** and access them like you would on a local filesystem
- **No cloud lock-in** - easily move your code from Azure GPUs to Google TPUs with a one-line change

In [29]:
# TODO: make this pip install sky
# pip install -e ..
import os
import tempfile

## Hello, Sky!
We can specify the following task attributes with a YAML file:
- `resources` (optional): what cloud resources the task must be run on (e.g. accelerators, instance type, etc.)
- `setup` (optional): commands that must be run before the task is executed
- `run`: the commands that make up the actual task
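
The attribute rules in the list above can be checked before a spec is written to disk. The following is a minimal sketch, not part of Sky: `check_task_spec` is a hypothetical helper that takes the spec as a plain Python dict, and it treats `run` as required only because the list marks the other two attributes as optional.

```python
# Hypothetical helper for illustration; this is not part of the Sky API.
def check_task_spec(spec):
    """Return a list of problems found in a task spec given as a dict."""
    problems = []

    # Assumption: `run` is required, since it is the only attribute
    # the list does not mark as optional.
    run = spec.get("run")
    if not isinstance(run, str) or not run.strip():
        problems.append("'run' must be a non-empty string of commands")

    # `setup` is optional, but when present it should be a command string.
    setup = spec.get("setup")
    if setup is not None and not isinstance(setup, str):
        problems.append("'setup' must be a string of commands")

    # `resources` is optional, but when present it should be a mapping.
    resources = spec.get("resources")
    if resources is not None and not isinstance(resources, dict):
        problems.append("'resources' must be a mapping")

    return problems


hello_sky = {
    "resources": {"cloud": "aws", "accelerators": "K80"},
    "setup": 'echo "running setup"',
    "run": 'echo "hello sky!"\nping localhost -c 5',
}

print(check_task_spec(hello_sky))              # no problems reported
print(check_task_spec({"setup": "echo hi"}))   # reports the missing 'run'
```

The check covers only the three attributes introduced here and does not reject other keys.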

In [42]:
hello_sky_spec = """
resources:
 cloud: aws
 accelerators: K80

setup: |
 echo "running setup"

run: |
 echo "hello sky!"
 ping localhost -c 5
"""

with tempfile.NamedTemporaryFile('w', prefix='sky_tour-', delete=False) as fp:
    fp.write(hello_sky_spec)
    fp.flush()
    os.environ["SKY_SPEC"] = fp.name

Sky handles selecting an appropriate cluster size based on user-specified resource constraints, launching the cluster on an appropriate cloud provider, and executing the task. 

To launch a task based on our above YAML spec, we can use `sky launch`. The `-c` option allows us to specify a cluster name. If a cluster with that name already exists (which can be viewed with `sky status`), Sky will re-use that cluster. If no such cluster exists, a new cluster with that name will be provisioned. If no cluster name is provided (e.g. `sky launch task.yaml`), then a cluster name will be autogenerated.
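
The three cluster-name cases described above can be summarized as a small decision function. This is an illustrative sketch only: `resolve_cluster` is a hypothetical name, the set of existing clusters is passed in by hand rather than read from `sky status`, and the generated-name format is a placeholder, not Sky's actual scheme.

```python
import uuid


# Hypothetical sketch of the naming rules; not Sky's implementation.
def resolve_cluster(requested_name, existing_clusters):
    """Return (cluster_name, action) for the three cases described above."""
    if requested_name is None:
        # No name given: generate one. The format here is a placeholder.
        return f"sky-{uuid.uuid4().hex[:4]}", "provision"
    if requested_name in existing_clusters:
        # A cluster with this name already exists: re-use it.
        return requested_name, "reuse"
    # Name given but no such cluster: provision a new one under that name.
    return requested_name, "provision"


existing = {"test-github", "sky-bbbd-ubuntu"}

print(resolve_cluster("test-github", existing))  # ('test-github', 'reuse')
print(resolve_cluster("mycluster", existing))    # ('mycluster', 'provision')
print(resolve_cluster(None, existing))           # generated name, 'provision'
```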

In [44]:
!sky launch -c mycluster $SKY_SPEC

File Mount: (/tmp/setup.sh:/tmp/sky_setup_2521886668.sh) refers to a file.
 To ensure this mount updates properly, please use a directory.
2022-01-13 23:10:31,931	INFO util.py:282 -- setting max workers for head node type to 0
Checking AWS environment settings
Destroying cluster. Confirm [y/N]: y [automatic, due to --yes]
File Mount: (/tmp/setup.sh:/tmp/sky_setup_2521886668.sh) refers to a file.
 To ensure this mount updates properly, please use a directory.
2022-01-13 23:10:33,504	INFO util.py:282 -- setting max workers for head node type to 0
Loaded cached provider configuration
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
Fetched IP: 18.220.133.95
Stopped all 14 Ray processes.
Shared connection to 18.220.133.95 closed.
Requested 1 nodes to shut down. [interval=1s]
0 nodes remainin

The task prints "hello sky!" and pings localhost 5 times. We can also run `sky status` to see that a new cluster named `mycluster` was created on AWS.

In [45]:
!sky status

Sky Clusters
+-----------------+-------------+--------------------+------------------------------------------------+--------+
|       NAME      |   LAUNCHED  |     RESOURCES      | COMMAND                                        | STATUS |
+-----------------+-------------+--------------------+------------------------------------------------+--------+
|   test-github   |  17 hrs ago | 1x AWS(m4.2xlarge) | sky cpunode -c test-github                     |   UP   |
| sky-bbbd-ubuntu | 13 mins ago | 1x AWS(m4.2xlarge) | sky launch /tmp/sky_tour-h_z1hu1m              |   UP   |
|    mycluster    | 51 secs ago | 1x AWS(p2.xlarge)  | sky launch -c mycluster /tmp/sky_tour-u76bp__4 |   UP   |
+-----------------+-------------+--------------------+------------------------------------------------+--------+

## Training a Language Model on TPU

In [59]:
lm_spec = """
resources:
  accelerators: tpu-v3-8
  accelerator_args:
    tf_version: 2.5.0

setup: |
  pip install --upgrade pip

  conda activate huggingface
  if [ $? -eq 0 ]; then
    echo 'conda env exists'
  else
    conda create -n huggingface python=3.8 -y
    conda activate huggingface
    pip install -r requirements.txt
  fi

run: |
  conda activate huggingface
  python -u run_tpu.py
"""

with tempfile.NamedTemporaryFile('w', prefix='sky_tour-', delete=False) as fp:
    fp.write(lm_spec)
    fp.flush()
    os.environ["SKY_SPEC"] = fp.name

In [62]:
!sky launch -c sky-b1d1-ubuntu $SKY_SPEC

Detected YAML file: /tmp/sky_tour-5gxbi87d
I 01-14 00:04:41 resources.py:56] Missing tpu_name in accelerator_args, using default (sky_tpu)
Running task on cluster sky-b1d1-ubuntu ...
I 01-14 00:04:41 execution.py:83] Optimizer target is set to COST.
I 01-14 00:04:41 optimizer.py:208] Defaulting estimated time to 1 hr. Call Task.set_time_estimator() to override.
I 01-14 00:04:41 optimizer.py:307] Optimizer - plan minimizing cost (~$8.5):
I 01-14 00:04:41 optimizer.py:321] 
I 01-14 00:04:41 optimizer.py:321] TASK                                                                                                         BEST_RESOURCE
I 01-14 00:04:41 optimizer.py:321] Task(run='conda activate huggi...')                                                                          GCP(n1-highmem-8, {'tpu-v3-8': 1}, accelerator_args={'tf_version': '2.5.0', 'tpu_name': 'sky_tpu'})
I 01-14 00:04:41 optimizer.py:321]   resources: {None(None, {'tpu-v3-8': 1}, accelerator_args={'tf_vers

## Interactive Development
Sky also supports iterative development with interactive nodes. Currently, GPU, CPU, and TPU (Google Cloud only) nodes are supported. They can be configured with a variety of options and support port forwarding, tmux, screen, and spot instances. We also enable native ssh support on all clusters for debugging and interactive development.

Try the following in your terminal!
```
sky gpunode -c mygpu --gpus K80 --cloud gcp
```

If you would like to temporarily stop a cluster, use `sky stop`.

In [52]:
!sky stop mycluster

File Mount: (/tmp/setup.sh:/tmp/sky_setup_2521886668.sh) refers to a file.
 To ensure this mount updates properly, please use a directory.
2022-01-13 23:37:35,689	INFO util.py:282 -- setting max workers for head node type to 0
Loaded cached provider configuration
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
Destroying cluster. Confirm [y/N]: y [automatic, due to --yes]
File Mount: (/tmp/setup.sh:/tmp/sky_setup_2521886668.sh) refers to a file.
 To ensure this mount updates properly, please use a directory.
2022-01-13 23:37:35,777	INFO util.py:282 -- setting max workers for head node type to 0
Fetched IP: 18.214.89.219
Stopped all 14 Ray processes.
Shared connection to 18.214.89.219 closed.
Stopping instances i-0a9ae425f8260dff1 (to terminate instead, set `cache_stopped_nodes: False` under `provider` in the cluster c

In [56]:
!sky status

Sky Clusters
+-----------------+-------------+--------------------+-----------------------------------+--------+
|       NAME      |   LAUNCHED  |     RESOURCES      | COMMAND                           | STATUS |
+-----------------+-------------+--------------------+-----------------------------------+--------+
|   test-github   |  18 hrs ago | 1x AWS(m4.2xlarge) | sky cpunode -c test-github        |   UP   |
| sky-bbbd-ubuntu | 43 mins ago | 1x AWS(m4.2xlarge) | sky launch /tmp/sky_tour-h_z1hu1m |   UP   |
|      mycpu      |  9 mins ago | 1x AWS(m4.2xlarge) | sky cpunode -c mycpu              |   UP   |
|    mycluster    | 19 secs ago | 1x AWS(p2.xlarge)  | sky start mycluster               |   UP   |
+-----------------+-------------+--------------------+-----------------------------------+--------+

To restart the cluster, use `sky start`.
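
Outside a notebook, the same stop and start commands can be scripted. The sketch below assumes the `sky` CLI is on `PATH`; `build_sky_command` and `run_sky` are hypothetical helper names, not part of Sky. The example only prints the commands, so it can be tried without touching any cluster.

```python
import subprocess

# Only the subcommands used in this section of the tour.
ALLOWED_ACTIONS = ("stop", "start", "status")


def build_sky_command(action, cluster=None):
    """Build the argument list for a `sky` CLI call (hypothetical helper)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    command = ["sky", action]
    if cluster is not None:
        command.append(cluster)
    return command


def run_sky(action, cluster=None):
    """Run the command; assumes the `sky` CLI is installed and on PATH."""
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    return subprocess.run(build_sky_command(action, cluster), check=True)


print(build_sky_command("stop", "mycluster"))   # ['sky', 'stop', 'mycluster']
print(build_sky_command("start", "mycluster"))  # ['sky', 'start', 'mycluster']
print(build_sky_command("status"))              # ['sky', 'status']
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a cluster name contains unusual characters.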

In [54]:
!sky start mycluster

Starting cluster mycluster...
I 01-13 23:41:55 cloud_vm_ray_backend.py:624] To view detailed progress: tail -n100 -f sky_logs/sky-2022-01-13-23-41-55-979318/provision.log
I 01-13 23:41:55 cloud_vm_ray_backend.py:634] 
I 01-13 23:41:55 cloud_vm_ray_backend.py:634] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1e,us-east-1f)
Shared connection to 54.146.90.55 closed.
Shared connection to 54.146.90.55 closed.
Shared connection to 54.146.90.55 closed.
Shared connection to 54.146.90.55 closed.
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Shared connection to 54.146.90.55 closed.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuro

In [55]:
!sky status

Sky Clusters
+-----------------+----------------+--------------------+-----------------------------------+--------+
|       NAME      |    LAUNCHED    |     RESOURCES      | COMMAND                           | STATUS |
+-----------------+----------------+--------------------+-----------------------------------+--------+
|   test-github   |   18 hrs ago   | 1x AWS(m4.2xlarge) | sky cpunode -c test-github        |   UP   |
| sky-bbbd-ubuntu |  43 mins ago   | 1x AWS(m4.2xlarge) | sky launch /tmp/sky_tour-h_z1hu1m |   UP   |
|      mycpu      |   9 mins ago   | 1x AWS(m4.2xlarge) | sky cpunode -c mycpu              |   UP   |
|    mycluster    | a few secs ago | 1x AWS(p2.xlarge)  | sky start mycluster               |   UP   |
+-----------------+----------------+--------------------+-----------------------------------+--------+

## Sky Storage

In [61]:
# Should probably get its own set of docs. 
# A bit hard to show in jupyter walkthrough unless we have local mounts?

## Python API
For advanced use cases, users can automate workflows with Sky through our Python API.

In [None]:
import sky

backend = sky.backends.CloudVmRayBackend()

with sky.Dag() as dag:
    resources = sky.Resources(cloud=sky.AWS(), accelerators={'V100': 1})
    setup_commands = 'echo "Hello, Sky!"'
    task = sky.Task(run='ping 127.0.0.1 -c 5',
                    setup=setup_commands,
                    name='ping').set_resources(resources)

sky.launch(dag, backend=backend)

I 01-14 00:07:15 execution.py:83] Optimizer target is set to COST.
I 01-14 00:07:15 optimizer.py:208] Defaulting estimated time to 1 hr. Call Task.set_time_estimator() to override.
I 01-14 00:07:15 optimizer.py:307] Optimizer - plan minimizing cost (~$3.1):
I 01-14 00:07:15 optimizer.py:321] 
I 01-14 00:07:15 optimizer.py:321] TASK    BEST_RESOURCE
I 01-14 00:07:15 optimizer.py:321] ping    AWS(p3.2xlarge)
I 01-14 00:07:15 optimizer.py:321] 
I 01-14 00:07:15 cloud_vm_ray_backend.py:993] [36mCreating a new cluster: "sky-d83a-ubuntu" [1x AWS(p3.2xlarge)].[0m
I 01-14 00:07:15 cloud_vm_ray_backend.py:993] Tip: to reuse an existing cluster, specify --cluster-name (-c) in the CLI or use sky.launch(.., cluster_name=..) in the Python API. Run `sky status` to see existing clusters.
I 01-14 00:07:15 cloud_vm_ray_backend.py:624] To view detailed progress: [1mtail -n100 -f sky_logs/sky-2022-01-14-00-07-15-794515/provision.log[0m
I 01-14 00:07:15 cloud_vm_ray_backend.py:634] 
I 01-14 00:07:15 c

#### Check out our example YAML specs and Python scripts under `prototype/examples/` to get started! 