____

* **Day 1**: Determining what information should be monitored with a dashboard. [Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-1), [Livestream Recording](https://www.youtube.com/watch?v=QO2ihJS2QLM)
* **Day 2**: How to create effective dashboards in notebooks, [Python Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-2-python), [R Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-2-r), [Livestream](https://www.youtube.com/watch?v=rhi_nexCUMI)
* **Day 3**: Running notebooks with the Kaggle API, [Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-3), [Livestream](https://youtu.be/cdEUEe2scNo)
* **Day 4**: Scheduling notebook runs using cloud services, [Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-4), [Livestream](https://youtu.be/Oujj6nT7etY)
* **Day 5**: Testing and validation, [Python Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-5), [R Notebook](https://www.kaggle.com/rtatman/dashboarding-with-notebooks-day-5-r), [Livestream](https://www.youtube.com/watch?v=H6DcpIykT8E)

____


Welcome to the third day of Dashboarding with scheduled notebooks. Today we're going to do two things:

* Get your Kaggle credentials set up in a cloud service (either GCP or PythonAnywhere)
* Run a kernel from the shell provided by that service

Today's timeline: 

* **5 minutes:** Read notebook & pick service to use
* **5 minutes:** Make account & sign up
* **5 minutes:** Get your credentials set up
* **5 minutes:** Run kernel from cloud service


# The Kaggle API

Today we’re going to be using the Kaggle API to run our notebooks. The full documentation for the API is [on GitHub](https://github.com/Kaggle/kaggle-api); today I’ll just be covering how to download and upload notebooks.

> **What’s an API?** An API is an application programming interface. It lets you interact with a program or website by using a programming language rather than a graphical interface like the front end of a website. 

The most common sticking point when starting with the API is in setting up your credentials. Your credentials are what allow you to interact with Kaggle using your account. If you don’t have them set up just right, the API will throw an error when you try to run commands. Since this is a common source of problems, I’ll be going over how to set them up in excruciating detail today (that’s why the notebook is so long!). 
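To give a rough idea of what "set up just right" means, here's a minimal sketch of a sanity check you could run on a credentials file. It's not part of the Kaggle API: the function name `check_kaggle_credentials` is made up for illustration, and it assumes the file is a JSON object with `username` and `key` fields that should only be readable by you on a Unix-like system.

```python
import json
import stat
from pathlib import Path


def check_kaggle_credentials(path):
    """Return a list of problems found with a kaggle.json file (empty if none).

    Hypothetical helper for illustration only; not part of the Kaggle API.
    """
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]

    try:
        creds = json.loads(path.read_text())
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(creds, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    # Assumption: the token file holds a username and a key.
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"missing '{field}' field")

    # Credentials should not be readable by group or others (chmod 600).
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        problems.append(f"permissions are {oct(mode)}, expected 0o600")
    return problems
```

An empty list means the file passed these basic checks; it doesn't guarantee that Kaggle will accept the key.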

We're also going to be using Bash today to set up our credentials. If you're not familiar with Bash or it's just been a while, my colleague Alexis has put together [a helpful getting started guide](https://www.kaggle.com/alexisbcook/intro-to-unix-commands) to help you get up to speed.

____

# Which cloud service should you pick?

I’ve picked two services to walk you through using. Here’s a quick run-down of some pros and cons of each to help you decide which one to use. You’re also free to use any other cloud service; these are just the ones I’ll be talking about.

### GCP

[GCP](https://cloud.google.com) is Google’s cloud platform. It’s made up of a large number of services, but for this event we’ll only be using four:

* [Cloud Shell](https://cloud.google.com/shell/docs/), a shell environment for managing resources hosted on Google Cloud Platform. This is the only one we’ll be using today; the other three are going to come into play tomorrow.
* [Cloud Functions](https://cloud.google.com/functions/docs/concepts/overview), which allow you to write single purpose functions that you can automatically run whenever a trigger happens. I’ll be showing you how to write functions in Python, but you can also use JavaScript if you prefer. We’re going to be using these functions to push and then pull our notebooks.
* [Cloud Scheduler](https://cloud.google.com/scheduler/), a fairly new service that lets you schedule and run cron jobs in the cloud. (We’ll talk about cron jobs tomorrow, if you’re not familiar with them.) We’ll use Cloud Scheduler to trigger our Cloud Functions.
* [Cloud Storage](https://cloud.google.com/storage/getting-started/), which, as the name suggests, lets you store things, specifically objects or blobs. We’ll be using it to store notebooks and notebook metadata. 

There are advantages and disadvantages to using GCP for this project. The main advantage is that, once it’s set up, you can easily scale up your work and integrate with other GCP products, like the Speech API for audio transcription or the Maps API for getting shape files.

The main disadvantage is that, to schedule jobs, you need to set up a billing account, which requires a credit card. For this project, you shouldn’t need more resources than are available in the [free tier](https://cloud.google.com/free) unless you really go to town, but it is possible that you’ll be charged. Beyond the issue of cost, I also know that not everyone has access to a credit card. If you don’t, I’d recommend using PythonAnywhere instead.

#### [Instructions for using GCP here](#GCP-Instructions)

___

### PythonAnywhere

[PythonAnywhere](https://www.pythonanywhere.com/) is an in-browser Python coding environment. I picked it because I find it to be fairly user friendly and because it lets you schedule scripts to run as cron jobs.

> **IMPORTANT NOTE:** PythonAnywhere will have scheduled downtime for a systems upgrade on Wednesday 19th December 2018 (2018-12-19) at 07:00 AM UTC. They expect approximately 20 minutes of downtime. (I only found out about this Tuesday or I would have planned around it! 😅)

The main advantage of PythonAnywhere is that you don’t need a credit card to schedule a script to run. The main disadvantage (besides the really unfortunate timing on that downtime) is that PythonAnywhere doesn’t have as many features as GCP. It’s mainly designed for running web apps or websites, so if you’re not working in that domain you may end up needing to migrate to a different service that offers more features.

#### [Instructions for using PythonAnywhere here](#PythonAnywhere-Instructions)

____

# GCP Instructions

Here are step-by-step instructions on how to use the Kaggle API to run kernels from the Cloud Shell. 

Note that any place where I’ve written something in all caps, like “YOUR KAGGLE USERNAME HERE”, you’ll need to replace that text (along with any square brackets around it) with the real value, such as your actual Kaggle username. 

### Open Cloud Shell

1. Sign into your Google Account (if you don’t have a Google account, [create one first](https://support.google.com/accounts/answer/27441?hl=en)).
2. Go to https://cloud.google.com/
3. Click "Go To Console". This will take you to your GCP dashboard.
4. Click on the square button with an arrow and dash in the top right hand corner that says "Activate Cloud Shell" when you hover over it.
5. You should see a black shell open at the bottom of your screen. Click in it to begin typing.
6. (Optional: You can check out the readme to learn more about the shell by running `cat README-cloudshell.txt`)

### Install the Kaggle API

1. Install Kaggle by running `sudo pip install kaggle`

### Set up your credentials

1. Go to your Kaggle account page at `https://www.kaggle.com/[YOUR KAGGLE USERNAME HERE]/account`.
2. Scroll down to the API section and click "Create New API Token". This will download a .json file with your Kaggle credentials.
3. Go back to the Cloud Shell in your GCP account.
4. Upload your credentials by clicking on the three dots in the menu in the header of the shell and then clicking on the "Upload File" command. Follow the prompts to upload the .json file with your credentials. They should be uploaded to the directory `/home/[YOUR GCP USERNAME HERE]`, which is the directory your session starts in by default.
5. (Optional: Run the command `kaggle`. This will throw an error because you haven't moved your credentials to the correct directory yet. The final line of the error will tell you the path to put your credentials in.)
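6. Move your credentials to the .kaggle directory by running `mv kaggle.json /home/[YOUR GCP USERNAME HERE]/.kaggle/kaggle.json`. (If you get an error saying the .kaggle directory doesn't exist, create it first by running `mkdir --parents /home/[YOUR GCP USERNAME HERE]/.kaggle`.)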
7. Make your credentials private by running `chmod 600 /home/[YOUR GCP USERNAME HERE]/.kaggle/kaggle.json`
8. Check that you did everything correctly by running the command `kaggle -h`. This should bring up the API help menu.
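If you'd rather script the move-and-lock-down steps above than type them by hand, here's a minimal Python sketch of the same idea. The function name `install_credentials` and its arguments are hypothetical; it just mirrors the `mkdir`, `mv` and `chmod 600` shell commands using the standard library, and assumes a Unix-like system.

```python
import os
import shutil
from pathlib import Path


def install_credentials(downloaded_json, home_dir):
    """Move a downloaded kaggle.json into home_dir/.kaggle and make it private.

    Hypothetical helper for illustration; mirrors the shell steps above.
    """
    kaggle_dir = Path(home_dir) / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)    # like `mkdir --parents`
    target = kaggle_dir / "kaggle.json"
    shutil.move(str(downloaded_json), str(target))   # like `mv`
    os.chmod(target, 0o600)                          # like `chmod 600`
    return target
```

For example, `install_credentials("kaggle.json", Path.home())` would do the same job as steps 6 and 7 when run from the directory you uploaded the file to.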

### Pull your notebook

> To "pull" a notebook means to download a local copy of it. 

1. (Optional: you can search for your kernels, ranked by how recently you ran them, by running `kaggle kernels list --user [YOUR KAGGLE USERNAME HERE] --sort-by dateRun`.)
2. Pull a copy of your kernel by running `kaggle kernels pull [AUTHOR'S KAGGLE USERNAME]/[KERNEL SLUG FROM URL] -m`. The `-m` flag tells the API to pull the kernel's metadata file as well. For example, if you wanted to pull a copy of [this kernel](https://www.kaggle.com/rtatman/world-bank-open-calls-dashboard), you would run `kaggle kernels pull rtatman/world-bank-open-calls-dashboard -m`.
3. Check that you pulled it correctly by running `ls`.  You should see that the notebook file and metadata file are both in your current working directory.
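To make the "username/slug" part of the pull command a bit more concrete, here's a small sketch that builds the command from a kernel's URL. The function `pull_command` is hypothetical, and it assumes the URL has the form `https://www.kaggle.com/AUTHOR/SLUG` like the example above; other URL layouts would need different handling.

```python
from urllib.parse import urlparse


def pull_command(kernel_url):
    """Build the `kaggle kernels pull` command for a kernel URL.

    Hypothetical helper; assumes a URL shaped like
    https://www.kaggle.com/AUTHOR/SLUG.
    """
    parts = [p for p in urlparse(kernel_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"could not find an author and slug in {kernel_url!r}")
    author, slug = parts[0], parts[1]
    # -m also pulls the kernel's metadata file.
    return ["kaggle", "kernels", "pull", f"{author}/{slug}", "-m"]


# To actually run it (requires the Kaggle API and credentials to be set up):
#   import subprocess
#   subprocess.run(pull_command("https://www.kaggle.com/rtatman/world-bank-open-calls-dashboard"), check=True)
```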

### Push your notebook

> To "push" a notebook means to upload a copy to Kaggle. When you push your notebook, it is automatically committed and creates a new version. Since committing a notebook runs all the code from top to bottom, this may take a while if you have a very computation-heavy notebook.

1. Check to make sure that you have both the .ipynb and `kernel-metadata.json` files in your current working directory by running `ls`. If you don't see them, either move to the directory where you downloaded them or pull a fresh copy of the notebook.
2. Push your notebook by running `kaggle kernels push`.
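If you end up scripting this, the check in step 1 is easy to automate. Here's a minimal sketch; the function name `missing_push_files` is made up, and it only looks for a `kernel-metadata.json` file and at least one `.ipynb` file, without checking that their contents are valid.

```python
from pathlib import Path


def missing_push_files(directory="."):
    """List what's missing from a directory before running `kaggle kernels push`.

    Hypothetical helper; an empty list means both expected files were found.
    """
    directory = Path(directory)
    missing = []
    if not (directory / "kernel-metadata.json").is_file():
        missing.append("kernel-metadata.json")
    if not any(directory.glob("*.ipynb")):
        missing.append("a .ipynb notebook file")
    return missing
```

If the list comes back empty, you should be good to run the push command from that directory.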

And that's all you need to do to update your Kaggle Kernels from the command line on GCP!

____

# PythonAnywhere Instructions

Here are step-by-step instructions on how to use the Kaggle API to run kernels from the PythonAnywhere Bash Console.

Note that any place where I’ve written something in all caps, like “YOUR KAGGLE USERNAME HERE”, you’ll need to replace that text (along with any square brackets around it) with the real value, such as your actual Kaggle username. 

### Create a Bash Console

1. Log in to your PythonAnywhere account. If you don’t have one yet, you can [sign up here](https://www.pythonanywhere.com/registration/register/beginner/). 
2. Create a Bash console by going to your account home (it will be at `https://www.pythonanywhere.com/user/[YOUR PYTHONANYWHERE USERNAME]`) and clicking on the `$ Bash` button under “New console:”. You should see a black shell with a green dollar sign.  

### Install the Kaggle API

1. Run the command `pip install kaggle --user`

### Set up your credentials

1. Go to your Kaggle account page at `https://www.kaggle.com/[YOUR KAGGLE USERNAME HERE]/account`.
2. Scroll down to the API section and click "Create New API Token". This will download a .json file with your Kaggle credentials.
3. Go to your file upload page at `https://www.pythonanywhere.com/user/[YOUR PYTHONANYWHERE USERNAME]/files/home/[YOUR PYTHONANYWHERE USERNAME]`. 
4. Click on the yellow “Upload a file” button and follow the instructions to upload your `kaggle.json` file.
5. Go back to your PythonAnywhere shell. 
6. Check that your file has been uploaded by running `ls`. You should see `kaggle.json` listed. 
7. Make a .kaggle directory and then move the .json file with your credentials to it by running the command `mkdir --parents /home/[YOUR PYTHONANYWHERE USERNAME]/.kaggle; mv kaggle.json /home/[YOUR PYTHONANYWHERE USERNAME]/.kaggle/kaggle.json`.
8. Make your credentials private by running `chmod 600 /home/[YOUR PYTHONANYWHERE USERNAME]/.kaggle/kaggle.json`
9. Check that you did everything correctly by running the command `kaggle -h`. This should bring up the API help menu.

### Add proxy information to your credentials

This step will give your PythonAnywhere account the internet access it needs to connect to Kaggle via the API. 

1. Go to the .kaggle directory by running `cd /home/[YOUR PYTHONANYWHERE USERNAME]/.kaggle`. 
2. Open the nano editor to edit your credentials by running `nano kaggle.json`.
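3. Add the line `"proxy": "http://proxy.server:3128"` to the end of your .json file, just before the closing `}`. You'll also need to add a comma after your key so that the file is still valid JSON. The final file should look something like this: 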

```
{"username":"YOUR KAGGLE USERNAME", 
"key":"YOUR KAGGLE KEY", 
"proxy": "http://proxy.server:3128"}
```

4. Hit CTRL + O and then enter to save your changes.
5. Hit CTRL + X to exit the editor. 
6. (Optional: Print out your file to make sure it looks correct by running `cat kaggle.json`.)
7. Navigate back to your home directory by running `cd ~`. 
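If you'd rather not edit the JSON by hand (it's easy to miss a comma!), the same change can be sketched in Python with the standard `json` module. The function name `add_proxy` is hypothetical, and the default proxy address is just the one from the step above.

```python
import json
from pathlib import Path


def add_proxy(credentials_path, proxy="http://proxy.server:3128"):
    """Add a "proxy" entry to a kaggle.json file and return the updated contents.

    Hypothetical helper; rewrites the file in place as valid JSON.
    """
    path = Path(credentials_path)
    creds = json.loads(path.read_text())
    creds["proxy"] = proxy
    path.write_text(json.dumps(creds))
    return creds
```

Because the file is rewritten in place, it's still worth printing it out with `cat` afterwards to make sure it looks the way you expect.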

### Pull your notebook

> To "pull" a notebook means to download a local copy of it. 

1. (Optional: you can search for your kernels, ranked by how recently you ran them, by running `kaggle kernels list --user [YOUR KAGGLE USERNAME HERE] --sort-by dateRun`.)
2. Pull a copy of your kernel by running `kaggle kernels pull [AUTHOR'S KAGGLE USERNAME]/[KERNEL SLUG FROM URL] -m`. The `-m` flag tells the API to pull the kernel's metadata file as well. For example, if you wanted to pull a copy of [this kernel](https://www.kaggle.com/rtatman/world-bank-open-calls-dashboard), you would run `kaggle kernels pull rtatman/world-bank-open-calls-dashboard -m`.
3. Check that you pulled it correctly by running `ls`.  You should see that the notebook file and metadata file are both in your current working directory.

### Push your notebook

> To "push" a notebook means to upload a copy to Kaggle. When you push your notebook, it is automatically committed and creates a new version. Since committing a notebook runs all the code from top to bottom, this may take a while if you have a very computation-heavy notebook.

1. Check to make sure that you have both the .ipynb and `kernel-metadata.json` files in your current working directory by running `ls`. If you don't see them, either move to the directory where you downloaded them or pull a fresh copy of the notebook.
2. Push your notebook by running `kaggle kernels push`.

And that’s all you need to do to run your notebooks on PythonAnywhere!

# Your turn!

Today’s a fairly self-explanatory one; pick a service and follow the directions to get your credentials set up and run your notebook using the API. :)

