<p style="text-align:center;">
    <img src="https://raw.githubusercontent.com/skypilot-org/skypilot/master/docs/source/images/skypilot-wide-light-1k.png" width=500>
</p>

# Welcome to SkyPilot! 👋

SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, managed execution, and multi-region serving.

Use the clouds **easily** and **cost effectively**, without needing cloud infra expertise.

_Ease of use_
* **Run existing projects on the cloud** with zero code changes
* Use a **unified interface** to run on any cloud, without vendor lock-in (including AWS, Azure, and GCP)
* **Queue jobs** on one or multiple clusters
* **Automatic failover** to find scarce resources (GPUs) across regions and clouds
* **Use datasets on the cloud** like you would on a local file system
* **One-click deployment** of models for inference

_Cost saving_
* Run jobs on **spot instances** with **automatic recovery** from preemptions
* Run GenAI Serving on **spot instances** for high availability at low cost
* Hands-free cluster management: **automatically stopping idle clusters**
* One-click use of **GPUs**, for high-performance, cost-effective training
* Automatically benchmark and find the cheapest hardware for your job

# Learning outcomes 🎯

After completing this notebook, you will be able to:

1. Understand the basic SkyPilot YAML interface (`setup`, `run`).
2. Run a hello world task on a cloud of your choice.
3. SSH into your cluster for debugging and development.
4. Terminate the cluster and understand the cluster lifecycle.
5. Run your task seamlessly across different clouds.

# How to use this Tutorial

These notebooks serve as an **interactive** introduction to SkyPilot.

There are points in these notebooks where you may need to edit files outside the notebook and open a terminal to run some commands. These points will be highlighted with **two icons**:

### <span style="color:green">[DIY]</span> 📝 - Edit an external file

> **💡 Hint** - Remember to save your file after making any changes!

### <span style="color:green">[DIY]</span> 💻 - Run commands in an interactive terminal window

Use these icons as a hint to know when to switch away from the current notebook and edit a file or open a terminal.

> **💡 Hint** - If you're using JupyterLab, you can open a terminal in your browser by going to `File -> New -> Terminal`.

# Preflight checks - verifying cloud credential setup

Before we start this tutorial, let's run `sky check` to make sure your credentials are set up correctly.

After running the cell below, you should see the AWS and GCP clouds marked as `enabled`.

> **💡 Hint** - If you don't see any clouds enabled, please refer to the [SkyPilot docs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup) on how to set up your cloud accounts.

> **💡 Hint** - SkyPilot also supports Azure, Lambda, RunPod, and other public clouds (about a dozen in total). They are not used in this tutorial, but you can check out our [docs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup) to learn how to set up other cloud accounts.

In [None]:
# Run this cell to check that your cloud accounts are set up to work with SkyPilot
! sky check

# Writing your first SkyPilot Task

A **task** in SkyPilot specifies the command that must be run on the cloud, along with the resources required (e.g. GPUs, TPUs, number of nodes) and any dependencies (e.g., files, packages and libraries).

Tasks in SkyPilot are defined as YAML files. Here is an example:

-----------------------------------
```yaml
# example.yaml
name: example

setup: |
  echo "Run any setup commands here"
  pip install cowsay

run: |
  echo "Hello SkyPilot!"
  cowsay -t "Moo! SkyPilot!"
```
----------------------------------- 

This defines a task with the following components:

* **setup**: commands that must be run before the task is executed. Here we install any dependencies for the task.

* **run**: commands that run the actual task.
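
To make the structure concrete, here is a minimal Python sketch that renders a task shaped like the one above as YAML text. The helper name `render_task_yaml` is hypothetical and is not part of SkyPilot; it handles only the `name`, `setup`, and `run` fields described in this section.

```python
def render_task_yaml(name, setup_cmds, run_cmds):
    """Render a task with `name`, `setup`, and `run` fields as YAML text.

    Hypothetical helper for illustration only; not a SkyPilot API.
    """
    lines = [f"name: {name}", ""]
    for field, cmds in (("setup", setup_cmds), ("run", run_cmds)):
        lines.append(f"{field}: |")  # '|' starts a YAML literal block
        lines.extend(f"  {cmd}" for cmd in cmds)  # one shell command per line
        lines.append("")
    return "\n".join(lines)


task_yaml = render_task_yaml(
    name="example",
    setup_cmds=['echo "Run any setup commands here"', "pip install cowsay"],
    run_cmds=['cowsay -t "Moo! SkyPilot!"'],
)
print(task_yaml)
```

Saving the printed text to a `.yaml` file would give you a task file you can pass to `sky launch`, as shown in the next section.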

# Launching your first SkyPilot Task with `sky launch`

Once your task YAML is ready, you can run it on the cloud with `sky launch`.

## <span style="color:green">[DIY]</span> 💻 Launch your Sky Task!

**In a terminal window, run:**

-------------------------
```console
sky launch example.yaml -c hello-sky
```
-------------------------

This will take about a minute to run.

> **💡 Hint** - If you're using JupyterLab, you can open a terminal in your browser by going to `File -> New -> Terminal`.

You'll notice that SkyPilot will perform multiple actions for you:
#### **1. Find the lowest priced VM instance type across different clouds**

SkyPilot will run its optimizer and present you with the cheapest option that meets your resource requirements.

```console
(base) root@b42d8750f97a:/skycamp-tutorial/01_hello_sky# sky launch example.yaml -c hello-sky
Task from YAML spec: example.yaml
Considered resources (1 node):
--------------------------------------------------------------------------------------------------------------------------------------------------
 CLOUD        INSTANCE        vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE                                                    COST ($)   CHOSEN   
--------------------------------------------------------------------------------------------------------------------------------------------------
 Kubernetes   2CPU--2GB       2       2         -              gke_skycamp-skypilot-fastchat_us-central1-c_skycamp-gke-test   0.00          ✔     
 GCP          n2-standard-8   8       32        -              us-central1-a                                                  0.39                
--------------------------------------------------------------------------------------------------------------------------------------------------
Launching a new cluster 'hello-sky'. Proceed? [Y/n]:
```
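
The idea behind this step can be sketched in a few lines of Python: filter the candidates that satisfy the requested resources, then pick the one with the lowest cost. This is only an illustration using the two rows from the table above. The `Candidate` class and the `cheapest` function are hypothetical names, and SkyPilot's real optimizer may weigh other factors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    cloud: str
    instance: str
    vcpus: int
    mem_gb: int
    cost: float  # value from the COST ($) column


def cheapest(candidates, min_vcpus=1, min_mem_gb=1):
    """Return the lowest-cost candidate that meets the minimum requirements."""
    feasible = [c for c in candidates
                if c.vcpus >= min_vcpus and c.mem_gb >= min_mem_gb]
    if not feasible:
        raise ValueError("no candidate satisfies the requested resources")
    return min(feasible, key=lambda c: c.cost)


# The two rows shown in the optimizer table above.
candidates = [
    Candidate("Kubernetes", "2CPU--2GB", 2, 2, 0.00),
    Candidate("GCP", "n2-standard-8", 8, 32, 0.39),
]

print(cheapest(candidates).instance)               # 2CPU--2GB
print(cheapest(candidates, min_vcpus=4).instance)  # n2-standard-8
```

With no extra requirements the Kubernetes option wins, matching the `CHOSEN` column. Asking for at least 4 vCPUs rules it out and leaves the GCP instance.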

#### **2. Provision the cluster**

SkyPilot will set up a cluster with the requested resources and configure an SSH profile for it.


#### **3. Run the task's `setup` commands to prepare the cluster for running the task**

SkyPilot will run any commands specified in the `setup` field in the YAML on the VMs in the cluster. In this case, it will install the `cowsay` package.


#### **4. Run the task's `run` commands**

Finally, SkyPilot will run the commands specified in the `run` field. These commands can use any dependencies installed in the `setup` phase.

```console
(example, pid=1487) Hello SkyPilot!
(example, pid=1487)   ______________
(example, pid=1487) | Moo! SkyPilot! |
(example, pid=1487)   ==============
(example, pid=1487)               \
(example, pid=1487)                \
(example, pid=1487)                  ^__^
(example, pid=1487)                  (oo)\_______
(example, pid=1487)                  (__)\       )\/\
(example, pid=1487)                      ||----w |
(example, pid=1487)                      ||     ||
✓ Job finished (status: SUCCEEDED).
```

# Tasks and Clusters in SkyPilot

**Tasks** in SkyPilot are executed on **clusters**. A **cluster** is a collection of nodes on a cloud.

When you run a task with `sky launch`, SkyPilot creates a new cluster with a random name if an existing cluster is not specified.

> **💡 Hint** - When running `sky launch`, you can give the cluster a name with the `-c` flag. For example, `sky launch -c mycluster example.yaml` launches a cluster named `mycluster`. If a cluster with that name already exists, SkyPilot will try to reuse it and re-run the `setup` commands on it.

You can see a table of your clusters with the command `sky status`.

## <span style="color:green">[DIY]</span> 💻 Checking your cluster status with `sky status`

**In a terminal window, run:**


-------------------------
```console
sky status
```
-------------------------

### Expected output
-------------------------
```console
(base) root@b42d8750f97a:/skycamp-tutorial/01_hello_sky# sky status
Clusters
NAME       LAUNCHED   RESOURCES                 STATUS  AUTOSTOP  COMMAND                        
hello-sky  1 min ago  1x Kubernetes(2CPU--2GB)  UP      -         sky launch example.yaml -c...  

Managed jobs
No in-progress managed jobs. (See: sky jobs -h)

Services
No live services. (See: sky serve -h)
```
-------------------------

We can see that the earlier `sky launch` command created a cluster with the name `hello-sky`.

## <span style="color:green">[DIY]</span> 💻 SSH into the cluster!

For debugging and development, you can easily SSH into a SkyPilot cluster with the `ssh` utility. 

**In a terminal window, run:**

-------------------------
```console
ssh hello-sky
```
-------------------------

### Expected output

This will drop you into an interactive terminal inside your cluster:

-------------------------
```console
(base) root@b42d8750f97a:/skycamp-tutorial/01_hello_sky# ssh hello-sky
Warning: Permanently added '127.0.0.1' (ECDSA) to the list of known hosts.
Warning: Permanently added '10.12.0.13' (ECDSA) to the list of known hosts.
Linux hello-sky-3851-head 6.1.100+ #1 SMP PREEMPT_DYNAMIC Sat Aug 24 16:19:44 UTC 2024 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
(base) sky@hello-sky-3851-head:~$ hostname
hello-sky-3851-head
```
-------------------------

You can press `Ctrl+D` to exit the SSH session.

> **💡 Hint** - To enable the SSH functionality, SkyPilot adds the remote cluster to your `~/.ssh/config`. This means you can use the cluster name as an alias with other SSH-based tools, such as `scp`, `rsync`, VS Code and more!
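
Because the cluster name works as an SSH alias, you can also drive it from scripts. The sketch below only builds the command lines and prints them; it does not connect to anything. The helper names `ssh_argv` and `scp_upload_argv` are hypothetical, and `notes.txt` is a placeholder file name.

```python
import shlex


def ssh_argv(cluster, remote_command):
    """Build an argv list that runs `remote_command` on the cluster via its SSH alias."""
    return ["ssh", cluster, remote_command]


def scp_upload_argv(cluster, local_path, remote_path):
    """Build an argv list that copies a local file to the cluster with scp."""
    return ["scp", local_path, f"{cluster}:{remote_path}"]


run_hostname = ssh_argv("hello-sky", "hostname")
upload_notes = scp_upload_argv("hello-sky", "notes.txt", "notes.txt")

print(shlex.join(run_hostname))  # ssh hello-sky hostname
print(shlex.join(upload_notes))  # scp notes.txt hello-sky:notes.txt

# To actually execute a command (the cluster must be UP):
#   import subprocess
#   subprocess.run(run_hostname, check=True)
```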

# Cluster lifecycle management

SkyPilot clusters can exist in four states, each of which has different billing and storage implications:

* **`INIT`** - Cluster is initializing.
* **`UP`** - Cluster is up and running. You will be billed for the instance and the attached storage.
* **`STOPPED`** - Cluster nodes are shut down and their disks are kept. Your data and node state are safe, and the cluster can be restored to a running state when required. You will be billed only for the storage.
* **`TERMINATED`** - Cluster is terminated and all nodes and their attached disks are deleted. These clusters cannot be restarted and will not be shown in `sky status`.

To manage these states, SkyPilot offers several useful commands:

1. **`sky stop`** - stops an `UP` cluster.
2. **`sky start`** - starts a `STOPPED` cluster.
3. **`sky down`** - terminates an `UP` or `STOPPED` cluster.
4. **`sky autostop`** - sets a cluster to automatically stop after a period of inactivity.

> **💡 Hint** - `sky stop` and `sky start` are useful when you want to suspend your experiments for a while but want to quickly resume later. `sky down` is useful to delete a cluster and restart a job from scratch.
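
The states and commands above can be summarized as a small state machine. The Python sketch below is a mental model only, not SkyPilot code: the `TRANSITIONS` and `BILLED_FOR` tables and the `apply_command` function are hypothetical names. The empty billing set for `TERMINATED` is an inference from the fact that its nodes and disks are deleted. `INIT` and `sky autostop` are left out because the text above does not describe their transitions.

```python
# (current state, command) -> next state, as described in the lists above.
TRANSITIONS = {
    ("UP", "stop"): "STOPPED",
    ("STOPPED", "start"): "UP",
    ("UP", "down"): "TERMINATED",
    ("STOPPED", "down"): "TERMINATED",
}

# What you pay for in each state (TERMINATED is inferred: everything is deleted).
BILLED_FOR = {
    "UP": {"instance", "storage"},
    "STOPPED": {"storage"},
    "TERMINATED": set(),
}


def apply_command(state, command):
    """Return the state a cluster moves to after `sky <command>`."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(
            f"'sky {command}' is not modeled for a {state} cluster") from None


state = "UP"
for command in ("stop", "start", "down"):
    state = apply_command(state, command)
    print(f"sky {command} -> {state}, billed for: {sorted(BILLED_FOR[state])}")
```

Note that there is no transition out of `TERMINATED`: once a cluster is terminated, you need a fresh `sky launch` to get a new one.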

## <span style="color:green">[DIY]</span> 💻 Terminate your cluster!
Now that we are done using the cluster, let's terminate it to stop being billed for it. You can use `sky down` to terminate a cluster.

**Run `sky down` to terminate the cluster**

-------------------------
```console
sky down hello-sky
```
-------------------------



### Expected output

-------------------------
```console
(base) root@b42d8750f97a:/skycamp-tutorial/01_hello_sky# sky down hello-sky
Terminating 1 cluster: hello-sky. Proceed? [Y/n]: Y
Terminating cluster hello-sky...done.
Terminating 1 cluster ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
```
-------------------------

#### 🎉 Congratulations! You have used SkyPilot to seamlessly run tasks on the cloud! Please proceed to the next notebook to learn how to use accelerators and object stores in SkyPilot.
