# Example Binder

This is a test binder. I used it to develop documentation on how to make binders so that the technology is more accessible to instructors and researchers everywhere.

I will be accessing Landsat 8 data under the AWS Public Dataset Program: https://registry.opendata.aws/landsat-8/

The binder itself launches on AWS, which should make it easier to load the data in the cloud.

In [1]:
import numpy as np
#import matplotlib.pyplot as plt
#import scipy

#import xarray as xr
import intake  # needed for the catalog call in the Intake section

from dask_kubernetes import KubeCluster
from dask.distributed import Client, LocalCluster
from dask.distributed import wait, progress

## Setup

There are two things to set up before we can do any analysis: computational power and data access. Both are usually huge time sinks.

### Dask: Computational Power

Dask, through `dask_kubernetes`, lets us launch a cluster of workers on the Kubernetes deployment that runs this binder on AWS. The cluster scales its number of workers with the workload (between 4 and 40 in the cell below), so computations run faster when you need them to, and you do not pay for workers that sit idle.

In [None]:
cluster = KubeCluster()
cluster.adapt(minimum=4, maximum=40)
cluster
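
To make the `minimum` and `maximum` bounds concrete, here is a toy model of the clamping idea. It is only a sketch: the function `target_workers` and its `tasks_per_worker` parameter are hypothetical names invented for illustration, and Dask's real adaptive heuristic weighs more signals (such as task durations and memory) than this does.

```python
import math


def target_workers(pending_tasks: int, tasks_per_worker: int = 10,
                   minimum: int = 4, maximum: int = 40) -> int:
    """Toy model: size the cluster for the backlog, clamped to [minimum, maximum].

    This is NOT Dask's scaling algorithm; it only illustrates how the bounds
    passed to cluster.adapt() cap the worker count at both ends.
    """
    wanted = math.ceil(pending_tasks / tasks_per_worker)
    return max(minimum, min(maximum, wanted))


for backlog in (0, 120, 1000):
    print(backlog, "pending tasks ->", target_workers(backlog), "workers")
```

With an empty backlog the model stays at the floor of 4 workers, and with a very large backlog it stops at the ceiling of 40, which is what keeps costs bounded.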

In [None]:
client = Client(cluster)
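
Once a client is connected, work is sent to the cluster through it. The sketch below assumes a thread-based `LocalCluster` as a stand-in for `KubeCluster`, so that it can run on a single machine without Kubernetes. The function `square` and the input range are illustrative choices, not part of the original notebook.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    return x * x


# Stand-in for KubeCluster: two in-process (threaded) workers, dashboard disabled.
with LocalCluster(n_workers=2, threads_per_worker=1,
                  processes=False, dashboard_address=None) as cluster:
    with Client(cluster) as client:
        futures = client.map(square, range(10))        # one task per input value
        total = client.submit(sum, futures).result()   # reduce on the cluster, then fetch

print(total)
```

The same `client.map` and `client.submit` calls should work unchanged against a client connected to the Kubernetes-backed cluster, since the client interface does not depend on where the workers run.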

### Intake: Data Ingestion

One big problem with big data is that it is too big for most conventional laptops, which is what most researchers have. Getting onto a High-Performance Computing (HPC) system can take a lot of time and authentication, which delays research. Hosting data on publicly available cloud platforms removes the stress of setting up access, and computing on the same platform removes most of the time otherwise spent moving the data to where the computation runs.

In [None]:
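# NOTE: open_catalog expects the path or URL of an Intake catalog file (typically YAML); the string below is the bucket's ARN, not a catalog.
landsat_data = intake.open_catalog('arn:aws:s3:::landsat-pds')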
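
The string in the cell above is an Amazon Resource Name (ARN), which identifies the bucket inside AWS. Data libraries generally address S3 objects by an `s3://bucket/key` URL instead. As a minimal sketch, the hypothetical helper `s3_arn_to_url` (not part of Intake or any AWS library) converts one form to the other, assuming the standard `aws` partition:

```python
def s3_arn_to_url(arn: str) -> str:
    """Convert an S3 ARN such as 'arn:aws:s3:::bucket/key' to 's3://bucket/key'.

    Hypothetical helper for illustration; assumes the standard 'aws' partition.
    """
    prefix = "arn:aws:s3:::"
    if not arn.startswith(prefix):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    resource = arn[len(prefix):]
    if not resource:
        raise ValueError("S3 ARN has no bucket name")
    return "s3://" + resource


bucket_url = s3_arn_to_url("arn:aws:s3:::landsat-pds")
print(bucket_url)
```

Note that the resulting URL still points at the bucket, not at a catalog. Opening data through Intake would additionally need a catalog file describing the sources, and this notebook does not yet specify one.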