# Jupyter and Pangeo - Interactive Computing and Data Science

In this short section, we'll provide a brief overview of the tools we'll be using from [Project Jupyter](https://jupyter.org). We will present this live during the tutorial from a more interactive slide deck, but these notes may be useful as a reading reference.

In brief, Jupyter provides a variety of tools to help you explore your data and computational workflows _interactively_: you use a programming language (Python in our case) to execute code, typically in small amounts, and immediately view the results, which then inform the next piece of code you run. This process, inherently iterative and exploratory, is a natural fit for scientific research and is commonly available in environments such as Matlab, IDL, and Mathematica.
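As a minimal sketch of that run-inspect-decide loop, the snippet below could be typed a few lines at a time into a notebook or Python prompt. The `elevations` values are made up purely for illustration and do not come from any real dataset.

```python
import statistics

# Hypothetical sample: a handful of surface elevations in meters (made-up values).
elevations = [12.1, 12.4, 11.9, 35.0, 12.2]

# Step 1: run a small piece of code and look at the result.
mean_elevation = statistics.mean(elevations)
print("mean:", mean_elevation)

# Step 2: the mean is pulled upward by the 35.0 value, so the natural next
# step is to try a statistic that is less sensitive to outliers.
median_elevation = statistics.median(elevations)
print("median:", median_elevation)

# Step 3: based on what we saw, filter out the suspicious value and recompute.
cleaned = [value for value in elevations if value < 20.0]
print("mean without outlier:", statistics.mean(cleaned))
```

Each step is chosen only after looking at the output of the previous one, which is the essence of the interactive workflow described above.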

Jupyter brings a set of open standards and tools to this space: it supports over 100 programming languages, provides interfaces that let you use this workflow similarly on your laptop or on a remote system (like the cloud or a supercomputer), and has a rich ecosystem of related and supporting tools.

Later we'll dive deeper into the Jupyter tools. For now, we'll quickly cover just enough of the two key ones, JupyterHub and JupyterLab, to help you find your bearings in this tutorial, which focuses on Git and GitHub.

## JupyterHub and Pangeo

While you can install the Jupyter tools on your laptop, we will instead run them in a cloud computing environment that offers scalable computational resources and sits close to the data we want to use. Instead of each of you having to download large amounts of data, we can spin up virtual computers in the cloud that come pre-configured with the same tools and packages you'd need locally and that are close (in a network sense) to the data.

[JupyterHub](https://jupyter.org/hub) is the Jupyter tool that helps us do this: it is a multi-user server that lets us easily host Jupyter and other interactive data science tools (such as RStudio) under one "web roof".

<img src="images/jupyterhub-logo.svg" width="300px">

[Pangeo](https://pangeo.io) is a project that combines JupyterHub with:

* [Dask](https://dask.org): high-level distributed computing, giving us scalability.
* [xarray](https://xarray.pydata.org): numerical computing on n-dimensional labeled arrays (NetCDF model).

and it provides an effective community platform for big data (geo)science.
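To give a feel for what "n-dimensional labeled arrays" means in practice, here is a small sketch using xarray on its own (Dask is not involved here). It assumes `numpy` and `xarray` are installed, as they typically are in a Pangeo environment; the variable name `temperature` and all values and coordinates are hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical data: 4 time steps on a 2 x 3 latitude/longitude grid.
values = np.arange(24, dtype=float).reshape(4, 2, 3)

temperature = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={
        "time": [0, 1, 2, 3],
        "lat": [-75.0, -70.0],
        "lon": [0.0, 10.0, 20.0],
    },
    name="temperature",
)

# Select by coordinate label rather than by integer position.
point_series = temperature.sel(lat=-70.0, lon=10.0)

# Reduce over a named dimension instead of a numeric axis.
time_mean = temperature.mean(dim="time")

print(point_series.values)
print(time_mean.dims)
```

Because dimensions and coordinates carry names, the code states what it selects and averages over, instead of relying on remembering which axis number is which.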

When you go to our Pangeo hub at https://icesat-2.hackweek.io, you should see this:

<img src="images/pangeo-login.png" width="90%">

After logging in with your GitHub credentials, you should get:

<img src="images/pangeo-server.png" width="90%">

Once you click on the red "Start" button, something like this should come up:

<img src="images/jupyterlab-00.png" width="90%">

## Pause: anyone with login issues?

# JupyterLab Basics

JupyterLab is an environment for interactive computing that can run locally or in the cloud. It lets you create Jupyter Notebooks, but it also offers many more tools and capabilities. We will go into more detail in our next tutorial, but for now let's take a brief look at its structure so you can get oriented.

This is what JupyterLab looks like upon startup:

<img src="images/jupyterlab-01.png" width="90%">

And this is JupyterLab with a number of tools open:

<img src="images/jupyterlab-02.png" width="90%">

Now, we will grab our tutorial materials! They are located in [this GitHub repository](https://github.com/ICESAT-2HackWeek/intro-git).

First, open a terminal in JupyterLab, and in that terminal type:

```bash
git clone https://github.com/ICESAT-2HackWeek/intro-git.git
``` 

You should see something like the following, with the new folder containing the tutorial appearing a few seconds later in the left sidebar:

<img src="images/jupyterlab-git-clone.png" width="90%">
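If you would like to double-check the result from Python, the sketch below shows one way to do it. By default, `git clone` names the new folder after the last part of the URL with the `.git` suffix removed, so here it should be `intro-git`. The helper names `cloned_folder_name` and `looks_like_git_repo` are hypothetical, and the check assumes Python is started in the same directory where you ran `git clone`.

```python
from pathlib import Path


def cloned_folder_name(url: str) -> str:
    """Return the folder name `git clone` uses by default for this URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


def looks_like_git_repo(folder: Path) -> bool:
    """Return True if the folder contains a `.git` directory."""
    return (folder / ".git").is_dir()


repo_url = "https://github.com/ICESAT-2HackWeek/intro-git.git"
folder = Path.cwd() / cloned_folder_name(repo_url)

# Assumption: the current working directory is where `git clone` was run.
print(folder.name, "cloned:", looks_like_git_repo(folder))
```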