In [0]:
import pandas as pd
from google.colab import files

# Uploading Files

First, to read in files we need to upload them to our virtual session.  There are a few ways we can do that.  We can upload them from our local machine :

In [0]:
# It's good practice to remove all files from the session first.  If a file
# with the same name is already there, the upload won't overwrite it, but will
# save a copy under a different name instead
%rm -r *

# Enable user to upload one or more files
uploaded_files = files.upload()

# We'll store a list of the uploaded filenames (the keys of the returned dict)
list_of_filenames = list(uploaded_files.keys())

# Print the files in the session
print ("Files in session : ")
%ls
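
`files.upload()` returns a dictionary that maps each uploaded filename to the file's contents as bytes, so an uploaded csv can also be parsed straight from memory.  The following is a minimal sketch rather than part of the original notebook : `rows_from_upload` and `fake_upload` are hypothetical names, and the dictionary is a stand-in for the real return value of `files.upload()` so that the example also runs outside CoLab.

```python
import csv
import io


def rows_from_upload(uploaded, filename, encoding="utf-8"):
    """Parse one uploaded csv (held as bytes) into a list of row dicts."""
    text = uploaded[filename].decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for the dict returned by files.upload() (filename -> bytes)
fake_upload = {"example.csv": b"species,petal_length\nsetosa,1.4\nvirginica,5.1\n"}

rows = rows_from_upload(fake_upload, "example.csv")
print(rows[0]["species"], rows[0]["petal_length"])
```

With Pandas, the equivalent would be along the lines of `pd.read_csv(io.BytesIO(uploaded_files[filename]))`, which gives a DataFrame instead of a list of dictionaries.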

Or we can grab a file from a GitHub repository.  Here we grab a csv file and read it into a Pandas DataFrame very easily.  NOTE - you MUST use the RAW url for the csv file (the url that appears when you click the RAW button on the csv file in GitHub) :

In [0]:
github_url = 'https://raw.githubusercontent.com/MichaelAllen1966/thursday_coding_club_2020/master/data/iris.csv'

df_git = pd.read_csv(github_url)

print (df_git)
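
If you only have the normal GitHub page url for a file, the RAW url can usually be worked out from it.  The sketch below assumes the common `github.com/<user>/<repo>/blob/<branch>/<path>` layout; `to_raw_url` is a hypothetical helper, and the page url used here is inferred from the RAW url above rather than taken from the original notebook.

```python
from urllib.parse import urlsplit


def to_raw_url(blob_url):
    """Convert a github.com 'blob' url into its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    if parts.netloc != "github.com":
        raise ValueError("Expected a github.com url")
    # Path looks like /<user>/<repo>/blob/<branch>/<path to file>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("Expected a url of the form /user/repo/blob/branch/path")
    user, repo = segments[0], segments[1]
    rest = "/".join(segments[3:])  # branch plus the path to the file
    return f"https://raw.githubusercontent.com/{user}/{repo}/{rest}"


page_url = ('https://github.com/MichaelAllen1966/thursday_coding_club_2020/'
            'blob/master/data/iris.csv')
raw_url = to_raw_url(page_url)
print(raw_url)
```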

# Downloading Files

Files in the session (those uploaded or created in your code) will be destroyed when the session is closed (or after 12 hours, whichever occurs first).  So it's important to get our files out of the session.  We can download them locally :

In [0]:
for filename in list_of_filenames:
  files.download(filename)
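
The loop above only covers files that were uploaded.  Files created by your code can be downloaded in the same way.  Here is a small sketch under a few assumptions : `write_csv`, `download_if_in_colab` and the file `example_results.csv` are made-up names for illustration, and the download step is skipped when the code is not running in CoLab.

```python
import csv


def write_csv(path, header, rows):
    """Write a header and rows to a csv file in the current working directory."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def download_if_in_colab(path):
    """Download the file through CoLab if available; return True if attempted."""
    try:
        from google.colab import files
    except ImportError:
        return False  # Not running in CoLab, so there is nothing to download
    files.download(path)
    return True


results_file = write_csv("example_results.csv",
                         ["run", "score"],
                         [[1, 0.5], [2, 0.75]])
download_if_in_colab(results_file)
```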

Or we can push the file(s) to a GitHub repository :

In [0]:
# Set up our github credentials (email and username)
# (replace with your credentials)
!git config --global user.email 'd.chalk@exeter.ac.uk'
!git config --global user.name 'celluauto'

# GitHub no longer accepts account passwords for Git operations over HTTPS, so
# enter a personal access token at the prompt.  getpass hides what is typed.
from getpass import getpass
password = getpass('Personal access token:')

# Clone the repository to which we want to push our files
# (replace with your username and repo name)
# This will create a folder in the session with the same name as your repo
!git clone https://celluauto:$password@github.com/celluauto/colab_test

# Copy the file(s) that we want to push to our new folder
%cp minutes_per_day_ca.csv colab_test/

# Change to the new folder
%cd colab_test

# Add the file(s) we want to push
!git add minutes_per_day_ca.csv

# Commit and specify message
!git commit -m 'This is my commit from colab'

# Push to your github repository (use 'main' instead of 'master' if that is
# the name of your repo's default branch)
!git push origin master

# Come back out of the folder and remove the folder.  If you don't do this,
# you'd create a new folder within a folder every time you run the clone command
%cd ..
%rm -r colab_test
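
One thing to watch for with the clone command above : if the secret contains characters such as '@' or '/', pasting it directly into the url will break it.  A hedged sketch of one way round this is to percent-encode the credentials first.  `authenticated_clone_url` is a hypothetical helper and the credentials shown are made up.

```python
from urllib.parse import quote


def authenticated_clone_url(username, token, repo):
    """Build an HTTPS clone url with the credentials percent-encoded."""
    # safe="" makes quote() encode every reserved character, including '/' and '@'
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@github.com/{username}/{repo}"


# Made-up credentials for illustration only (don't print a real secret)
clone_url = authenticated_clone_url("celluauto", "p@ss/word", "colab_test")
print(clone_url)
```

In a notebook cell the result could then be passed to git with `!git clone $clone_url`, in the same way `$password` is used above.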

# Importing a GitHub hosted Notebook into CoLab

We can REALLY easily load a GitHub hosted Jupyter Notebook directly into CoLab by simply replacing 'github.com/' in the url of the notebook with 'colab.research.google.com/github/', as demonstrated in the following example link (double click to see the syntax) :

[Open Mike's Iris Classification Notebook in CoLab!](https://colab.research.google.com/github/MichaelAllen1966/thursday_coding_club_2020/blob/master/04_iris_classification.ipynb)
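
The url swap described above can be written as a couple of lines of Python.  This is just an illustrative sketch; `to_colab_url` and the two prefix constants are names made up for this example.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"


def to_colab_url(github_notebook_url):
    """Swap the github.com prefix of a notebook url for the CoLab one."""
    if not github_notebook_url.startswith(GITHUB_PREFIX):
        raise ValueError("Expected a url starting with " + GITHUB_PREFIX)
    return COLAB_PREFIX + github_notebook_url[len(GITHUB_PREFIX):]


notebook_url = ('https://github.com/MichaelAllen1966/thursday_coding_club_2020/'
                'blob/master/04_iris_classification.ipynb')
colab_url = to_colab_url(notebook_url)
print(colab_url)
```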

# UI Code Snippets

Clicking on the <> icon on the left brings up some useful CoLab code snippets, particularly in relation to useful UI features.  Have an explore, but here are a few examples :

In [0]:
dan_approval_rate = 19 #@param {type: "slider", min: 0, max: 100}
""" IMPORTANT - YOU MUST RE-RUN THE CELL AFTER CHOOSING THE VALUE
YOU WANT OR THE VARIABLE WILL NOT UPDATE """

In [0]:
print ("Dan's Current Approval Rating is : ", dan_approval_rate)

In [0]:
select_best_penchordian = "Sean" #@param ['Dan', 'Mike', 'Tom', 'Sean', 'Kerry', 'Andy', 'Alison', 'Lucy', 'Martin', 'Sarah']
""" IMPORTANT - YOU MUST RE-RUN THE CELL AFTER CHOOSING THE VALUE
YOU WANT OR THE VARIABLE WILL NOT UPDATE """

In [0]:
print ("Who's the best PenCHORDian...?")

if select_best_penchordian == "Dan":
  print ("Yes that's right, it's Dan")
else:
  print ("No it's not ", select_best_penchordian, ", it's Dan", sep="")
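
Sliders and dropdowns are not the only form fields.  As far as I'm aware, the same `#@param` comment syntax also supports text boxes, whole numbers and tick boxes, as in the sketch below.  The variable names and values are invented for illustration, and outside CoLab the `#@param` annotations are simply ordinary comments, so the cell still runs as plain Python.

```python
#@title Example form (hypothetical values)
club_name = "Thursday Coding Club"  #@param {type: "string"}
number_of_sessions = 10  #@param {type: "integer"}
show_summary = True  #@param {type: "boolean"}

# Build the message from the form values
summary = "{}: {} sessions planned".format(club_name, number_of_sessions)

if show_summary:
  print (summary)
```

As with the slider, the cell needs to be re-run after changing a value for the variables to update.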

# GPU vs CPU

Let's show off the power of using a GPU vs CPU for Neural Network tasks.  We'll test this using a demo, which I've taken (and very slightly modified) from https://colab.research.google.com/notebooks/gpu.ipynb#scrollTo=v3fE7KmKRDsH.

First, before you run the following code, you need to change the runtime to use the GPU.  Click Edit -> Notebook Settings, and then select GPU in the dropdown list.  Once you've done that, run the next cell, which checks that you have successfully connected to a GPU.

In [0]:
%tensorflow_version 2.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
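
An alternative check, sketched here on the assumption that a TensorFlow 2.x version providing `tf.config.list_physical_devices` is installed, lists the GPUs TensorFlow can see rather than raising an error.  `visible_gpu_names` is a hypothetical helper, and it returns an empty list if TensorFlow isn't available at all.

```python
def visible_gpu_names():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []  # TensorFlow isn't installed in this environment
    return [device.name for device in tf.config.list_physical_devices('GPU')]


gpu_names = visible_gpu_names()
if gpu_names:
  print('Found GPU(s): {}'.format(', '.join(gpu_names)))
else:
  print('No GPU found - check the runtime type in Notebook Settings')
```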

Assuming that worked ok, we'll now run the demo code.  This demo generates a batch of 100 random 100x100 images, and then builds a convolutional neural network layer and applies it to that batch.  This operation is run on both the GPU and the CPU, and timeit is used to compare the time taken on each.  The operation is run 100 times on each (after one warm-up run), and the total time across the runs on the GPU vs the CPU is reported :

In [0]:
%tensorflow_version 2.x
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

def cpu():
  with tf.device('/cpu:0'):
    random_image_cpu = tf.random.normal((100, 100, 100, 3))
    net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
    return tf.math.reduce_sum(net_cpu)

def gpu():
  with tf.device('/device:GPU:0'):
    random_image_gpu = tf.random.normal((100, 100, 100, 3))
    net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
    return tf.math.reduce_sum(net_gpu)
  
# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

# Run the op several times.
print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of 100 runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=100, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=100, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))
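
Two small things to note about the timing code.  The `int()` call truncates the reported speedup (so 7.9x is shown as 7x), and `timeit.timeit` will also accept a function directly rather than a string plus a setup import.  The sketch below shows that pattern with two throwaway pure-Python functions standing in for `cpu()` and `gpu()`.  `total_time`, `slow_op` and `fast_op` are made-up names, and the timings it prints say nothing about GPU performance.

```python
import timeit


def total_time(func, number=100):
    """Total time in seconds taken by `number` calls of func."""
    return timeit.timeit(func, number=number)


def slow_op():
  # Stand-in for the slower operation
  return sum(i * i for i in range(20000))


def fast_op():
  # Stand-in for the faster operation (100 times less work)
  return sum(i * i for i in range(200))


slow_time = total_time(slow_op, number=20)
fast_time = total_time(fast_op, number=20)

# Report the ratio to one decimal place rather than truncating it
print('Speedup: {:.1f}x'.format(slow_time / fast_time))
```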

You can also use TPUs (Tensor Processing Units) in CoLab - you can see an example of their use here : https://colab.research.google.com/notebooks/tpu.ipynb#scrollTo=7Qv8rC4aVOFB

# Other things to check out

1. You can change to a dark mode theme in Tools -> Settings
2. Insert -> Scratch Code Cell allows you to insert a temporary cell that you can use to play around and test things.
3. If you have a GPU with CUDA cores, or you're not using the GPU but have a good CPU setup locally, you can choose to connect to a local runtime.  Simply click the drop down arrow next to the RAM and Disk readout on the top right, select "Connect to Local Runtime" and follow the instructions.
4. By default, your notebook can only be accessed by those who have the link and have been specifically invited by you.  By clicking 'Share' in the top right you can change this, so that anyone with the link can access it.  You can also choose the level of access people have :
   - Viewer (can view but not comment or amend it)
   - Commenter (can view and add comments but not amend it)
   - Editor (can change it)

   Typically, I'd recommend allowing anyone with the link to access it, but at 'Viewer' level.  If you have Viewer access to a notebook, you can still play around with it, or even amend it, by either :
   - taking a copy of the notebook and saving it in your own drive or GitHub repository (File -> Save a copy...)
   - entering playground mode (a button should appear on the top right allowing you to do this when you're a viewer of a notebook).  This allows you to play around to your heart's content with a temporary copy of the notebook, which is destroyed once your session ends.

   Try this out on the End of Life Care Model, available here : https://colab.research.google.com/drive/1K0-xsqG_uIpoWhw86tjOUM55OcFfhq_2