<p style="text-align:center;">
    <img src="https://raw.githubusercontent.com/skyplane-project/skyplane/main/docs/_static/logo-light-mode.png" width=500>
</p>

# Welcome to Skyplane!

## Skyplane enables fast data transfers between any cloud

Skyplane is a tool for blazingly fast bulk data transfers between object stores in the cloud. It provisions a fleet of VMs in the cloud to transfer data in parallel while using compression and bandwidth tiering to reduce cost.

Skyplane is:
1. 🔥 Blazing fast ([110x faster than AWS DataSync](https://skyplane.org/en/latest/benchmark.html))
2. 🤑 Cheap (4x cheaper than rsync)
3. 🌐 Universal (AWS, Azure, GCP, IBMCloud, and Cloudflare R2)

You can use Skyplane to transfer data: 
* between object stores within a cloud provider (e.g. AWS us-east-1 to AWS us-west-2)
* between object stores across multiple cloud providers (e.g. AWS us-east-1 to GCP us-central1)
* between local storage and cloud object stores (experimental)

# Exercises

This notebook consists of 4 exercises:

1. Exercise 1: Copying data between AWS regions
2. Exercise 2: Copying data with Skyplane
3. Exercise 3: Copying data to multiple destinations (multicast)
4. Exercise 4: Cleanup

# Learning outcomes 🎯

After completing this notebook, you will have:

1. An understanding of the Skyplane API
2. Transferred data for an ML model from an AWS S3 bucket to buckets in other AWS regions (`ap-south-1` and `eu-north-1`)
3. Compared `aws s3 cp` with `skyplane cp`
4. Terminated the transfer and cleaned up state



# How to use this Tutorial

These notebooks serve as a guide to Skyplane. If you get stuck at any point, feel free to ping us on the `#skyplane` channel on the [Skycamp Slack](https://join.slack.com/t/skycamp2022/shared_invite/zt-1gsrgky1z-iSFVEEOMSUD7Dd7B5syCsA).

We describe what we are doing in this notebook, and include the commands along with example responses. We highly recommend that you open a terminal and run the commands yourself.

### 💻 - Run commands in an interactive terminal window

You can use this icon as a hint to know when to switch away from the current notebook and edit a file or open a terminal. We also have example outputs that you can use to ensure consistency. 


# How to open a Terminal

If you're using JupyterLab, you can create a terminal in your browser by going to `File -> New -> Terminal`.

# Preflight - Initializing cloud credentials

Before we start this tutorial, we have a few preflight checks:

### Let's ensure we have the latest notebook

In [None]:
# Please run this cell
!git pull --quiet

### Update to the latest Skyplane pip package

In [None]:
!pip uninstall -y skyplane

In [None]:
# Please run this cell
!pip install -U "git+https://github.com/skyplane-project/skyplane.git@skycamp-tutorial-2023#egg=skyplane[aws]"

In [None]:
!pip install ipywidgets

### Configure Skyplane with AWS credentials. 

#### <span style="color:red">Choose `Y` only for AWS, and `n` for GCP and Azure.</span>

💻 `skyplane init`

```
 _____ _   ____   _______ _       ___   _   _  _____ 
/  ___| | / /\ \ / / ___ \ |     / _ \ | \ | ||  ___|
\ `--.| |/ /  \ V /| |_/ / |    / /_\ \|  \| || |__  
 `--. \    \   \ / |  __/| |    |  _  || . ` ||  __| 
/\__/ / |\  \  | | | |   | |____| | | || |\  || |___ 
\____/\_| \_/  \_/ \_|   \_____/\_| |_/\_| \_/\____/

03:37:54 [DEBUG] Found existing configuration file at /root/.skyplane/config, 
loading

(1) Configuring AWS:
    Do you want to configure AWS support in Skyplane? [Y/n]:
    Loaded AWS credentials from the AWS CLI [IAM access key ID: ...ZEXYJW]
    AWS region config file saved to /root/.skyplane/aws_config

(2) Configuring Azure:
    Do you want to configure Azure support in Skyplane? [Y/n]: n
    Disabling Azure support

(3) Configuring GCP:
    Do you want to configure GCP support in Skyplane? [Y/n]: n
    Disabling Google Cloud support

Config file saved to /root/.skyplane/config
To disable performance logging info: 
https://skyplane.org/en/latest/performance_stats_collection.html
```
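As a quick sanity check after `skyplane init`, the sketch below looks for the two files named in the sample output above. It assumes the `.skyplane` directory lives under your home directory (the output shows `/root/.skyplane`, which is root's home); the helper name `skyplane_config_status` is ours, not part of Skyplane.

```python
from pathlib import Path


def skyplane_config_status(home=None):
    """Report which Skyplane config files exist under <home>/.skyplane.

    ASSUMPTION: the file names ("config", "aws_config") are taken from the
    sample `skyplane init` output; your installation may differ.
    """
    base = (Path(home) if home is not None else Path.home()) / ".skyplane"
    return {name: (base / name).is_file() for name in ("config", "aws_config")}


# Prints a dict mapping each expected file name to True/False.
print(skyplane_config_status())
```

If either entry is `False`, re-running `skyplane init` is probably the first thing to try.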


> **💡 Hint** - If you run into any issues, please contact one of the Skyplane team members immediately. The rest of the tutorial depends on this step.

# Transferring Data with Skyplane

<p style="text-align:center;">
    <img src="./assets/unicast.jpg" width=700>
</p>

The core of Skyplane is based around the `cp` command. Suppose you want to transfer a fine-tuned [Gorilla](https://github.com/ShishirPatil/gorilla) model from one region to another so that it is accessible to a cross-regional serving cluster. Skyplane can help you transfer this data efficiently so your model weights are accessible across multiple regions. Let’s prepare for a transfer by first initializing buckets in a few different AWS regions.

# Creating Buckets

### Setting up AWS in the Destination region

First, let’s create a bucket in the destination region `aws:ap-south-1` to store the model weights. 

> **💡 Hint** - Reminder to replace [name] with a unique string. e.g., "edcvr"

In [None]:
bucket_name = "gorilla-weights-[name]"
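
If you would rather not pick the unique string by hand, a small helper can generate one and catch a forgotten `[name]` placeholder before any bucket is created. This is a sketch, not part of Skyplane: `make_bucket_name` is a hypothetical helper, and the check covers only a conservative subset of the S3 naming rules (3-63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit; S3 also permits dots, which are omitted here).

```python
import re
import uuid

# Conservative subset of the S3 bucket naming rules (dots deliberately excluded).
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def make_bucket_name(prefix="gorilla-weights", suffix=None):
    """Build '<prefix>-<suffix>' and reject names S3 would not accept."""
    # Default to a short random suffix to make collisions unlikely.
    suffix = suffix or uuid.uuid4().hex[:8]
    name = f"{prefix}-{suffix}".lower()
    if not _BUCKET_RE.fullmatch(name):
        raise ValueError(f"invalid S3 bucket name: {name!r}")
    return name


print(make_bucket_name())  # random suffix, different on every run
```

Passing `suffix="[name]"` raises `ValueError`, which is the mistake the hint above warns about.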

We can create the bucket through Skyplane's Python API.

In [None]:
import skyplane
client = skyplane.SkyplaneClient()

bucket_path = client.object_store().create_bucket(region="aws:ap-south-1", bucket_name=bucket_name)
bucket_path

# Exercise 1: Copying data between AWS regions

Transferring data between AWS regions with `aws s3 cp`

Each cloud provider has dedicated tools to move data between cloud regions. Let’s try transferring the data using AWS’s built-in `cp` command:

💻 `aws s3 cp --recursive s3://skycamp-demo-bucket/gorilla s3://{bucket_name}`

```
Completed 536.0 MiB/12.4 GiB (17.8 MiB/s) with 2 file(s) remaining
```

### This will take a long time to complete. Feel free to interrupt the command. Notice that it copies data at under 25 MiB/s.
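To put that rate in perspective, the arithmetic below estimates how long the full copy would take using the figures from the sample output (12.4 GiB total, 17.8 MiB/s, 536.0 MiB already done). It assumes the rate stays constant, which real transfers rarely do, so treat the result as a rough estimate of about 12 minutes rather than a measurement.

```python
MIB_PER_GIB = 1024


def eta_seconds(total_gib, rate_mib_s, done_mib=0.0):
    """Seconds needed to move the remaining data at a constant rate."""
    remaining_mib = total_gib * MIB_PER_GIB - done_mib
    return remaining_mib / rate_mib_s


# Figures copied from the sample `aws s3 cp` progress line above.
full = eta_seconds(12.4, 17.8)
left = eta_seconds(12.4, 17.8, done_mib=536.0)
print(f"full transfer: ~{full / 60:.1f} min, remaining: ~{left / 60:.1f} min")
```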


# Exercise 2: Let's try the same transfer with Skyplane

In [None]:
src_bucket_path = "s3://skycamp-demo-bucket/gorilla/"

In [None]:
client.copy(src_bucket_path, bucket_path, recursive=True, max_instances=4)

## 💡 Observe that Skyplane significantly reduces the time to move data
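To quantify the difference rather than eyeball it, you can wrap the copy in a small timer. The sketch below uses only the standard library; `stopwatch` is a hypothetical helper, and the 12.4 GiB size in the usage comment is taken from the `aws s3 cp` output earlier. The measured time covers everything inside the `with` block, so it includes any setup the call performs and not just the data movement.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, size_gib=None):
    """Time the enclosed block; optionally report average throughput."""
    result = {}
    start = time.perf_counter()
    try:
        yield result
    finally:
        elapsed = time.perf_counter() - start
        result["seconds"] = elapsed
        msg = f"{label}: {elapsed:.1f} s"
        if size_gib is not None and elapsed > 0:
            # Average rate over the whole block, in MiB/s.
            result["mib_per_s"] = size_gib * 1024 / elapsed
            msg += f" (~{result['mib_per_s']:.1f} MiB/s)"
        print(msg)


# Usage in this notebook (names come from the cells above):
# with stopwatch("skyplane copy", size_gib=12.4):
#     client.copy(src_bucket_path, bucket_path, recursive=True, max_instances=4)
```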

# Exercise 3: Transferring to multiple destinations
<p style="text-align:center;">
    <img src="./assets/multicast.jpg" width=700>
</p>

In some cases, data needs to be replicated to multiple destinations. For example, say you have some freshly trained model weights: you'll want to have them accessible across multiple regions as quickly as possible. In this example, we'll show how you can run a multicast (i.e. multi-destination) transfer using Skyplane. 

## Create a secondary region bucket

Let's create a second bucket in the additional destination region `aws:eu-north-1`.

> **💡 Hint** - Reminder to replace [name] with a unique string. e.g., "edcvr"

In [None]:
another_bucket_name = "gorilla-[name]"

In [None]:
another_bucket_path = client.object_store().create_bucket(region="aws:eu-north-1", bucket_name=another_bucket_name)
another_bucket_path

## Running a multicast transfer 
To run a multicast transfer, we can simply pass a list of destinations instead of a single destination.

In [None]:
client.copy(src_bucket_path, [bucket_path, another_bucket_path], recursive=True, max_instances=4)
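
After a multicast transfer it can be reassuring to confirm that every destination received every object. The sketch below is one way to do that with `boto3` rather than Skyplane itself; `list_keys` and `missing_keys` are hypothetical helpers. It assumes that destination keys equal the source keys with the `gorilla/` prefix stripped, which you should confirm against what actually lands in your buckets.

```python
def list_keys(bucket, prefix=""):
    """Return the set of object keys in `bucket` under `prefix`.

    Requires boto3 and working AWS credentials.
    """
    import boto3  # imported lazily so missing_keys() works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.update(obj["Key"] for obj in page.get("Contents", []))
    return keys


def missing_keys(src_keys, dst_keys, src_prefix=""):
    """Source objects (relative to src_prefix) that are absent from dst_keys."""
    # ASSUMPTION: destination keys are source keys with src_prefix removed.
    expected = {k[len(src_prefix):] for k in src_keys if k.startswith(src_prefix)}
    expected.discard("")  # ignore a bare "folder" placeholder key
    return expected - set(dst_keys)


# Usage in this notebook (bucket names come from the cells above):
# src = list_keys("skycamp-demo-bucket", "gorilla/")
# for name in (bucket_name, another_bucket_name):
#     print(name, missing_keys(src, list_keys(name), src_prefix="gorilla/"))
```

An empty set for each destination means nothing is missing under that assumption.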

# Exercise 4: Cleanup 
Finally, let's use the Skyplane API to delete the buckets we created.

In [None]:
client.object_store().delete_bucket(bucket_name, provider="aws")

In [None]:
client.object_store().delete_bucket(another_bucket_name, provider="aws")

## 💻 Terminate your cluster!

Finally, just to make sure that we don't have any instances running that might be burning up money, let's quickly deprovision everything.

💻 `skyplane deprovision`

```
No instances to deprovision
✓ Removing IPs from VPCs (4/4) in 2.05s

```

## 🎉 Congratulations! Your plane has now landed. Skyplane is an open-source project. Feel free to use Skyplane for all your data mobility needs!


#### Eager to learn more? 

#### Feel free to play around with the [Skyplane optimizer](https://optimizer.skyplane.org/), read our NSDI 2023 [paper](https://arxiv.org/abs/2210.07259), or browse through our GitHub [repository](https://github.com/skyplane-project/skyplane).

Acknowledgement: Thanks to [SkyPilot](https://github.com/romilbhardwaj/skypilot-tutorial/) for the notebook template.