<p style="text-align:center;">
    <img src="https://raw.githubusercontent.com/skypilot-org/skypilot/master/docs/source/images/skypilot-wide-light-1k.png" width=500>
</p>

# Welcome to SkyPilot! 👋

SkyPilot is a framework for easily running machine learning workloads on any cloud. 

Use the clouds **easily** and **cost-effectively**, without needing cloud infra expertise.

_Ease of use_
* **Run existing projects on the cloud** with zero code changes
* Use a **unified interface** to run on any cloud, without vendor lock-in (currently AWS, Azure, GCP)
* **Queue jobs** on one or multiple clusters
* **Automatic failover** to find scarce resources (GPUs) across regions and clouds
* **Use datasets on the cloud** like you would on a local file system 

_Cost saving_
* Run jobs on **spot instances** with **automatic recovery** from preemptions
* Hands-free cluster management: **automatically stopping idle clusters**
* One-click use of **TPUs**, for high-performance, cost-effective training
* Automatically benchmark and find the cheapest hardware for your job

# Learning outcomes 🎯

After completing this notebook, you will be able to:

1. Understand the basic SkyPilot YAML interface (`setup`, `run`).
2. Run a hello world task on a cloud of your choice.
3. SSH into your cluster for debugging and development.
4. Terminate the cluster and understand the cluster lifecycle.
5. Run your task seamlessly across different clouds.

# How to use this Tutorial

These notebooks serve as an **interactive** introduction to SkyPilot.

There are points in these notebooks where you may need to edit files outside the notebook and open a terminal to run some commands. These points will be highlighted with **two icons**:

### <span style="color:green">[DIY]</span> 📝 - Edit an external file
### <span style="color:green">[DIY]</span> 💻 - Run commands in an interactive terminal window

Use these icons as a hint to know when to switch away from the current notebook and edit a file or open a terminal.

> **💡 Hint** - If you're using JupyterLab, you can create a terminal in your browser by going to `File -> New -> Terminal`.

# Preflight checks - verifying cloud credential setup

Before we start this tutorial, let's run `sky check` to make sure your credentials are set up correctly.

After running the cell below, you should see the AWS and GCP clouds marked as `enabled`.

> **💡 Hint** - If you don't see any clouds enabled, please refer to the [SkyPilot docs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup) on how to set up your cloud accounts.

> **💡 Hint** - SkyPilot also supports Azure! Though it is not used in this tutorial, please check out our [docs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup) on how to set up Azure support.

In [None]:
# Run this cell to check if your cloud accounts are set up to work with SkyPilot
! sky check

# Writing your first SkyPilot Task

A **task** in SkyPilot specifies the command that must be run on the cloud, along with the resources required (e.g. GPUs, TPUs, number of nodes) and any dependencies (e.g., files, packages and libraries).

Tasks in SkyPilot are defined as YAML files. Here is an example:

-----------------------------------
```yaml
# example.yaml
name: example

setup: |
  echo "Run any setup commands here"
  pip install cowsay

run: |
  echo "Hello Stranger!"
  cowsay "Moo!"
```
----------------------------------- 

This defines a task with the following components:

* **setup**: commands that must be run before the task is executed. Here we install any dependencies for the task.

* **run**: commands that run the actual task.
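
The ordering of the two fields can be illustrated with a small local simulation. This is only a sketch, not how SkyPilot executes tasks: the `task` dictionary, `run_phase`, and `run_task_locally` are hypothetical names, the commands run in a local `sh` rather than on a cloud VM, and the `pip install cowsay` step is left out to keep the example free of side effects.

```python
import subprocess

# Hypothetical stand-in for a parsed task YAML: just the two fields above.
task = {
    "setup": 'echo "Run any setup commands here"',
    "run": 'echo "Hello SkyPilot!"',
}


def run_phase(script: str) -> str:
    """Run one phase's commands in a local shell and return its stdout."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout


def run_task_locally(task: dict) -> list:
    """Run `setup` first, then `run`, mirroring the order described above."""
    return [run_phase(task["setup"]), run_phase(task["run"])]


outputs = run_task_locally(task)
print("".join(outputs), end="")
```

Because `check=True` raises on a non-zero exit code, a failing `setup` phase in this sketch prevents `run` from starting.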

## <span style="color:green">[DIY]</span> 📝 Edit `example.yaml` to echo "Hello SkyPilot"
**Go ahead and open example.yaml and edit the run field to echo "Hello SkyPilot".**

# Launching your first SkyPilot Task with `sky launch`

Once your task YAML is ready, you can run it on the cloud with `sky launch`.

## <span style="color:green">[DIY]</span> 💻 Launch your Sky Task!

**In a terminal window, run:**

-------------------------
```console
sky launch 01_hello_sky/example.yaml
```
-------------------------

This will take about a minute to run.

> **💡 Hint** - If you're using JupyterLab, you can create a terminal in your browser by going to `File -> New -> Terminal`.

You'll notice that SkyPilot will perform multiple actions for you:
#### **1. Find the lowest priced VM instance type across different clouds**

SkyPilot will run its optimizer and present you with the cheapest VM type that fits your resource demand.

```console
$ sky launch example.yaml
(base) romilb@romilbx1yoga:skypilot-tutorial/01_hello_sky$ sky launch example.yaml 
Task from YAML spec: example.yaml
I 09-07 16:24:59 optimizer.py:605] == Optimizer ==
I 09-07 16:24:59 optimizer.py:617] Target: minimizing cost
I 09-07 16:24:59 optimizer.py:628] Estimated cost: $0.4 / hour
I 09-07 16:24:59 optimizer.py:628] 
I 09-07 16:24:59 optimizer.py:685] Considered resources (1 node):
I 09-07 16:24:59 optimizer.py:713] ---------------------------------------------------------------------
I 09-07 16:24:59 optimizer.py:713]  CLOUD   INSTANCE         vCPUs   ACCELERATORS   COST ($)   CHOSEN   
I 09-07 16:24:59 optimizer.py:713] ---------------------------------------------------------------------
I 09-07 16:24:59 optimizer.py:713]  AWS     m6i.2xlarge      8       -              0.38          ✔     
I 09-07 16:24:59 optimizer.py:713]  Azure   Standard_D8_v4   8       -              0.38                
I 09-07 16:24:59 optimizer.py:713]  GCP     n1-highmem-8     8       -              0.47                
I 09-07 16:24:59 optimizer.py:713] ---------------------------------------------------------------------
I 09-07 16:24:59 optimizer.py:713] 
Launching a new cluster 'sky-82ce-romilb'. Proceed? [Y/n]: Y
```
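
The "minimizing cost" target in the log above can be pictured as taking a minimum over the candidate rows. The sketch below uses the three rows from the table; `Candidate` and `pick_cheapest` are hypothetical names, and breaking the AWS/Azure tie by list order is an assumption made for illustration, not a statement about SkyPilot's optimizer.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    cloud: str
    instance: str
    vcpus: int
    hourly_cost: float  # USD per hour, as printed in the COST ($) column


# Rows copied from the optimizer table above.
candidates = [
    Candidate("AWS", "m6i.2xlarge", 8, 0.38),
    Candidate("Azure", "Standard_D8_v4", 8, 0.38),
    Candidate("GCP", "n1-highmem-8", 8, 0.47),
]


def pick_cheapest(options):
    """Return the lowest-cost candidate; ties go to the earliest entry."""
    return min(options, key=lambda c: c.hourly_cost)


chosen = pick_cheapest(candidates)
print(f"{chosen.cloud} {chosen.instance} at ${chosen.hourly_cost}/hour")
```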

#### **2. Provision the cluster**

SkyPilot will set up a cluster with the requested resources and configure an SSH profile for it.


#### **3. Run the task's `setup` commands to prepare the cluster for running the task**

SkyPilot will run any commands specified in the `setup` field in the YAML on the VMs in the cluster. In this case, it will install the `cowsay` package.


#### **4. Run the task's `run` commands**

Finally, SkyPilot will run the commands specified in the `run` field. These commands can use any dependencies installed in the `setup` phase.

```console
(example pid=23346) Hello SkyPilot!
(example pid=23346)  ______
(example pid=23346) < Moo! >
(example pid=23346)  ------
(example pid=23346)         \   ^__^
(example pid=23346)          \  (oo)\_______
(example pid=23346)             (__)\       )\/\
(example pid=23346)                 ||----w |
(example pid=23346)                 ||     ||
INFO: Job finished (status: SUCCEEDED).
```

# Tasks and Clusters in SkyPilot

**Tasks** in SkyPilot are executed on **clusters**. A **cluster** is a collection of nodes on a cloud.

When you run a task with `sky launch`, SkyPilot creates a new cluster with a random name if an existing cluster is not specified.

> **💡 Hint** - When running `sky launch`, you can give the cluster a name with the `-c` flag. E.g. `sky launch -c mycluster example.yaml` would launch a cluster with the name `mycluster`. If the cluster name already exists, then SkyPilot will try to reuse the cluster by re-running the `setup` commands on the cluster.

You can see a table of your clusters with the command `sky status`.

## <span style="color:green">[DIY]</span> 💻 Checking your cluster status with `sky status`

**In a terminal window, run:**


-------------------------
```console
sky status
```
-------------------------

### Expected output
-------------------------
```console
(base) romilb@romilbx1yoga:skypilot-tutorial/01_hello_sky$ sky status

NAME             LAUNCHED     RESOURCES            STATUS  AUTOSTOP  COMMAND                  
sky-82ce-romilb  19 mins ago  1x AWS(m6i.2xlarge)  UP      -         sky launch example.yaml  
```
-------------------------

We can see that the `sky launch` in the previous cells created a cluster with the name `sky-82ce-romilb`.
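
If you want to pull the cluster name out of this table programmatically, one rough approach is to split each row on runs of two or more spaces. This is a sketch under the assumption that the output looks like the sample above; `parse_status` is a hypothetical helper, and the column layout may differ between SkyPilot versions.

```python
import re

# Output copied from the example above; real output may have other columns.
status_output = """\
NAME             LAUNCHED     RESOURCES            STATUS  AUTOSTOP  COMMAND
sky-82ce-romilb  19 mins ago  1x AWS(m6i.2xlarge)  UP      -         sky launch example.yaml
"""


def parse_status(text: str) -> list:
    """Split each row on runs of two or more spaces and pair with the header."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = re.split(r"\s{2,}", lines[0])
    return [dict(zip(header, re.split(r"\s{2,}", row))) for row in lines[1:]]


clusters = parse_status(status_output)
print([c["NAME"] for c in clusters if c["STATUS"] == "UP"])
```

Splitting on two or more spaces keeps single-spaced values such as `19 mins ago` and `sky launch example.yaml` intact.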

## <span style="color:green">[DIY]</span> 💻 SSH into the cluster!

For debugging and development, you can easily SSH into a SkyPilot cluster with the `ssh` utility. 

**In a terminal window, run:**

-------------------------
```console
ssh <cluster-name>
```
-------------------------

### Expected output

This will drop you into an interactive terminal inside your cluster:

-------------------------
```console
(base) romilb@romilbx1yoga:skypilot-tutorial/01_hello_sky$ ssh sky-82ce-romilb 
Warning: Permanently added '18.234.228.139' (ECDSA) to the list of known hosts.
=============================================================================
       __|  __|_  )
       _|  (     /   Deep Learning AMI GPU PyTorch 1.10.0 (Ubuntu 20.04)
      ___|\___|___|
=============================================================================

Welcome to Ubuntu 20.04.4 LTS (GNU/Linux 5.13.0-1014-aws x86_64v)

Last login: Wed Sep  7 23:27:50 2022 from 24.23.130.196
ubuntu@ip-172-31-33-58:~$ echo $HOSTNAME
ip-172-31-33-58
```
-------------------------

You can use `ctrl+d` to exit from the SSH session.

> **💡 Hint** - To enable the SSH functionality, SkyPilot adds the remote cluster to your `~/.ssh/config`. This means you can use the cluster name alias with other ssh tools, such as `scp`, `rsync`, VSCode and more!
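
To see which aliases an OpenSSH config defines, you can scan it for `Host` lines. The sketch below works on a hypothetical config string: the exact stanza SkyPilot writes is not shown in this tutorial, so `sample_config` is an assumption assembled from the host address and user name visible in the session above, and `host_aliases` is an illustrative helper.

```python
# Hypothetical config text; the entry SkyPilot actually writes may differ.
sample_config = """\
Host sky-82ce-romilb
  HostName 18.234.228.139
  User ubuntu
"""


def host_aliases(config_text: str) -> list:
    """Collect the names listed on `Host` lines of an OpenSSH config."""
    aliases = []
    for line in config_text.splitlines():
        parts = line.split()
        # Match the `Host` keyword only, not `HostName`.
        if parts and parts[0].lower() == "host":
            aliases.extend(parts[1:])
    return aliases


print(host_aliases(sample_config))
```

Any alias found this way can be used wherever a hostname is expected, for example as the remote side of an `scp` or `rsync` command.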

# Cluster lifecycle management

SkyPilot clusters can exist in four states, each of which has different billing and storage implications:

* **`INIT`** - Cluster is initializing.
* **`UP`** - Cluster is up and running. You will be billed for the instance and the attached storage.
* **`STOPPED`** - Cluster nodes are shut down and their disks are suspended. Your data and node state are safe, and the cluster can be restored to a running state when required. You will be billed only for the storage.
* **`TERMINATED`** - Cluster is terminated and all nodes and their attached disks are deleted. These clusters cannot be restarted and will not be shown in `sky status`.

To manage these states, SkyPilot offers three useful commands:

1. **`sky stop`** - stops an `UP` cluster.
2. **`sky start`** - starts a `STOPPED` cluster.
3. **`sky down`** - terminates an `UP` or `STOPPED` cluster.

> **💡 Hint** - `sky stop` and `sky start` are useful when you want to suspend your experiments for a while but want to quickly resume later. `sky down` is useful to delete a cluster and restart a job from scratch.
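
The states and commands above form a small state machine, which can be sketched as a lookup table. This is an illustrative model only: `TRANSITIONS`, `BILLED_FOR`, and `apply_command` are hypothetical names, the table encodes just the transitions listed in this section, and the transient `INIT` state is skipped for simplicity. The real CLI may accept commands in other states.

```python
# Transitions taken from the command list above: (state, command) -> new state.
TRANSITIONS = {
    ("UP", "stop"): "STOPPED",
    ("STOPPED", "start"): "UP",  # simplification: skips the transient INIT state
    ("UP", "down"): "TERMINATED",
    ("STOPPED", "down"): "TERMINATED",
}

# What you pay for in each state, per the state descriptions above.
BILLED_FOR = {
    "UP": {"instance", "storage"},
    "STOPPED": {"storage"},
    "TERMINATED": set(),
}


def apply_command(state: str, command: str) -> str:
    """Return the next state, or raise if the command does not apply."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(
            f"`sky {command}` does not apply to a {state} cluster"
        ) from None


state = "UP"
for command in ["stop", "start", "down"]:
    state = apply_command(state, command)
    print(f"sky {command} -> {state}, billed for: {sorted(BILLED_FOR[state])}")
```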

## <span style="color:green">[DIY]</span> 💻 Terminate your cluster!
Now that we are done using the cluster, let's terminate it to stop being billed for it. You can use `sky down` to terminate a cluster.

**First, get the cluster name with `sky status`.**

-------------------------
```console
$ sky status
```
-------------------------

**and then run `sky down` to terminate the cluster**

-------------------------
```console
$ sky down <cluster-name>
```
-------------------------

### Expected output

-------------------------
```console
(base) romilb@romilbx1yoga:skypilot-tutorial/01_hello_sky$ sky down sky-82ce-romilb
Terminating 1 cluster: sky-82ce-romilb. Proceed? [Y/n]: Y
Terminating cluster sky-82ce-romilb...done.
Terminating 1 cluster ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ 100% 0:00:00
```
-------------------------

# Switching clouds with just one line change

One of the key benefits of using SkyPilot is the ability to seamlessly switch between different clouds for running your tasks.

You may have noticed the previous task was launched on AWS because it was cheaper than GCP. However, if we wish to use a specific cloud, we can override the optimizer by using the `--cloud` flag.

**Let's launch the same task on Google Cloud (GCP).**


## <span style="color:green">[DIY]</span> 💻 Launch example.yaml on Google Cloud with the `--cloud` flag

To override the SkyPilot optimizer and manually pick a cloud, use the `--cloud <cloud>` flag for `sky launch`.

**Go ahead and run the task on GCP using the `--cloud gcp` flag.**

-------------------------
```console
sky launch 01_hello_sky/example.yaml --cloud gcp
```
-------------------------

This will take about a minute.

### Expected output

You'll note that SkyPilot now considers only GCP as a possible resource. This is because the `--cloud` flag sets a hard constraint on the optimizer to use only GCP.


--------------------------
```console
(base) romilb@romilbx1yoga:skypilot-tutorial/01_hello_sky$ sky launch example.yaml --cloud gcp
Task from YAML spec: example.yaml
I 10-16 08:41:14 optimizer.py:605] == Optimizer ==
I 10-16 08:41:14 optimizer.py:628] Estimated cost: $0.5 / hour
I 10-16 08:41:14 optimizer.py:628] 
I 10-16 08:41:14 optimizer.py:685] Considered resources (1 node):
I 10-16 08:41:14 optimizer.py:713] -------------------------------------------------------------------
I 10-16 08:41:14 optimizer.py:713]  CLOUD   INSTANCE       vCPUs   ACCELERATORS   COST ($)   CHOSEN   
I 10-16 08:41:14 optimizer.py:713] -------------------------------------------------------------------
I 10-16 08:41:14 optimizer.py:713]  GCP     n1-highmem-8   8       -              0.47          ✔     
I 10-16 08:41:14 optimizer.py:713] -------------------------------------------------------------------
I 10-16 08:41:14 optimizer.py:713] 
Launching a new cluster 'sky-e2fc-romilb'. Proceed? [Y/n]: 
```
--------------------------
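
One way to picture a hard constraint is as a filter applied before cost minimization. The sketch below reuses the three rows from the first optimizer table in this notebook; `choose` is a hypothetical function and not SkyPilot's optimizer.

```python
# (cloud, instance, hourly cost in USD) rows from the first optimizer table.
candidates = [
    ("AWS", "m6i.2xlarge", 0.38),
    ("Azure", "Standard_D8_v4", 0.38),
    ("GCP", "n1-highmem-8", 0.47),
]


def choose(options, cloud=None):
    """Filter by cloud first (hard constraint), then minimize cost."""
    if cloud is not None:
        options = [o for o in options if o[0].lower() == cloud.lower()]
    if not options:
        raise ValueError(f"no candidates on cloud {cloud!r}")
    return min(options, key=lambda o: o[2])


print(choose(candidates))               # unconstrained choice
print(choose(candidates, cloud="gcp"))  # constrained to GCP
```

With the constraint in place the cheaper AWS and Azure rows are never compared, which is why the GCP instance is chosen despite its higher hourly cost.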

## <span style="color:green">[DIY]</span> 💻 Terminate your GCP cluster!
We're at the end of this notebook and we don't want to let your GCP cluster keep running and rack up a big bill! Let's terminate the cluster with `sky down`.

**First, get the cluster name with `sky status`.**

-------------------------
```console
sky status
```
-------------------------

**and then run `sky down` to terminate the cluster**

-------------------------
```console
sky down <cluster-name>
```
-------------------------

#### 🎉 Congratulations! You have used SkyPilot to seamlessly run tasks on two clouds! Please proceed to the next notebook to learn how to use accelerators and object stores in SkyPilot.
