[View in Colaboratory](https://colab.research.google.com/github/jessewei/data-integration/blob/master/Colab_Notebook_Introduction.ipynb)

# Notebook Functionality

Colaboratory works like an improved(?) Jupyter Notebook. The basics carry over:

1. Text Cells with [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) formatting
2. Code Cells
3. Notebook stores code, output, and execution order
4. Tab and Tab + Tab Autocomplete
5. IPython Help Features
6. IPython Magics (`%` line magics and `%%` cell magics)

## Additional Features

- collaborative editing
- history
- executed code history
- Shift+click multiple cell selection
- searchable code snippets + table of contents
- scratchpad (⌘/Ctrl + Alt + N)
- Stack Overflow button

## Keyboard Shortcuts

| Command | Action |
| ---- | ----: |
| ⌘/Ctrl+Enter | Run Selected Cell |
| Shift+Enter | Run Cell and Select Next |
| Alt+Enter | Run Cell and Insert New Cell |
| ⌘/Ctrl+M I | Interrupt Execution |

## Command Palette

![](http://hopelessoptimism.com/static/images/screenshots/colab/command_palette.png)

# How to...

## Create a new notebook

### With Python 2

![](http://hopelessoptimism.com/static/images/screenshots/colab/python2_colab.png)

### With Python 3

![](http://hopelessoptimism.com/static/images/screenshots/colab/python3_colab.png)

### With GPU Support
![](http://hopelessoptimism.com/static/images/screenshots/colab/gpu_colab.png)
![](http://hopelessoptimism.com/static/images/screenshots/colab/gpu2_colab.png)

## Import existing code/notebooks

## Installing Libraries

In [0]:
# https://keras.io/
!pip install -q keras
import keras

Using TensorFlow backend.


## Getting Help
- `?` and `??` (+ IPython Magics)
- tutorial notebooks
- Stack Overflow
- built-in snippets + help

In [0]:
import pandas as pd

In [0]:
# show the signature and docstring

pd.DataFrame?

In [0]:
# show the source code

??pd.DataFrame
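
The `?` and `??` helpers only work inside IPython. As a rough sketch of the same lookups in plain Python, the standard `inspect` module can be used; `json.dumps` stands in here as an arbitrary example object and is not part of the original notebook.

```python
import inspect
import json

# Roughly what `json.dumps?` shows: the call signature and the docstring.
signature = inspect.signature(json.dumps)
docstring = inspect.getdoc(json.dumps)
print(signature)
print(docstring.splitlines()[0])

# Roughly what `??json.dumps` adds: the source code.
# This only works for objects implemented in Python, not for C extensions.
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])
```
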

# Running your First Network

In [0]:
from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))


In [0]:
print(inputs)
print(' ')

Tensor("input_3:0", shape=(?, 784), dtype=float32)
 


In [0]:
layer = Dense(64, activation='relu')
print(' ')

 


In [0]:
x = layer(inputs)

print(x)
print(' ')

Tensor("dense_7_3/Relu:0", shape=(?, 64), dtype=float32)
 


In [0]:
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K


In [0]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD

# Generate dummy data
import numpy as np
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)


In [0]:
y_train

array([[0., 0., 0., ..., 0., 1., 0.],
       [1., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.]])
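
Each row of `y_train` above is a one-hot vector: a single `1.` at the index of the class label and `0.` everywhere else. The following is a minimal pure-Python sketch of that encoding, not the Keras implementation; `one_hot` is a hypothetical helper and the labels are made up for illustration.

```python
def one_hot(labels, num_classes):
    """Return one row per label with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three made-up class labels out of 10 possible classes.
encoded = one_hot([8, 0, 3], num_classes=10)
for row in encoded:
    print(row)
```
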

In [0]:
x_train

array([[0.14521926, 0.91954634, 0.04472922, ..., 0.40739888, 0.80800318,
        0.63560801],
       [0.55136025, 0.53602939, 0.30710589, ..., 0.18645107, 0.92911576,
        0.37086158],
       [0.34834615, 0.74986819, 0.27974187, ..., 0.74839222, 0.66662491,
        0.61150642],
       ...,
       [0.27381187, 0.07577917, 0.09821743, ..., 0.62359175, 0.95171429,
        0.8204232 ],
       [0.70034861, 0.82853168, 0.83870522, ..., 0.41366793, 0.72248952,
        0.92379898],
       [0.13853576, 0.89885876, 0.35824904, ..., 0.35132361, 0.3532724 ,
        0.69383763]])

In [0]:

model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20))
# 20 -> 64
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
# 64 -> 64
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
# 64 -> 10

In [0]:
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

In [0]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_8 (Dense)              (None, 64)                1344      
_________________________________________________________________
dropout_1 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_9 (Dense)              (None, 64)                4160      
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_10 (Dense)             (None, 10)                650       
Total params: 6,154
Trainable params: 6,154
Non-trainable params: 0
_________________________________________________________________
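
The `Param #` column can be checked by hand: a fully-connected layer has one weight per input/output pair plus one bias per output unit. A small sketch of that arithmetic follows; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


# Layer widths of the model above: 20 -> 64 -> 64 -> 10.
layer_sizes = [20, 64, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [1344, 4160, 650]
print(sum(counts))  # 6154
```

Dropout layers have no weights, which is why they show `0` parameters.
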


In [0]:
model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7fd0dd89a320>

In [0]:
score = model.evaluate(x_test, y_test, batch_size=128)



In [0]:
print(score)
print(' ')

[2.30511474609375, 0.1599999964237213]
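
The dummy labels were drawn at random, so a score close to chance is what one would expect here. As a rough sanity check, assuming a model that assigns equal probability to each of the 10 classes, the cross-entropy loss and accuracy would be:

```python
import math

num_classes = 10

# Cross-entropy of a uniform prediction: -log(1 / num_classes).
chance_loss = -math.log(1.0 / num_classes)
# Expected accuracy when guessing uniformly at random.
chance_accuracy = 1.0 / num_classes

print(round(chance_loss, 4))  # 2.3026
print(chance_accuracy)        # 0.1
```

The reported loss of about 2.305 sits right at that baseline, and 0.16 accuracy on only 100 test samples is within the noise around 0.1.
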
 


In [0]:
batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# only train on the first 1000 examples
x_train, y_train = x_train[:1000], y_train[:1000]

# only test on the first 100 examples
x_test, y_test = x_test[:100], y_test[:100]

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
x_train shape: (1000, 28, 28, 1)
1000 train samples
100 test samples
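
To see what the `Flatten` layer in this convnet receives, the spatial size can be traced layer by layer. This is a sketch that assumes the Keras defaults used above (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); `conv_valid` and `pool` are hypothetical helpers.

```python
def conv_valid(size, kernel):
    """Output width of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool(size, window):
    """Output width of non-overlapping max pooling."""
    return size // window


side = 28                    # MNIST images are 28 x 28
side = conv_valid(side, 3)   # Conv2D(32, (3, 3)) -> 26
side = conv_valid(side, 3)   # Conv2D(64, (3, 3)) -> 24
side = pool(side, 2)         # MaxPooling2D((2, 2)) -> 12

flattened = side * side * 64  # 64 feature maps from the second convolution
print(side, flattened)        # 12 9216
```
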


In [0]:
%%time

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)

Train on 1000 samples, validate on 100 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
Test loss: 0.1105849027633667
Test accuracy: 0.95
CPU times: user 3.09 s, sys: 549 ms, total: 3.64 s
Wall time: 3.84 s


In [0]:
print('Test loss:', score[0])
print('Test accuracy:', score[1])
print(' ')
print(' ')

Test loss: 0.1105849027633667
Test accuracy: 0.95
 
 


In [0]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM

model = Sequential()
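# NOTE: max_features (the vocabulary size) is never defined in this notebook,
# so this cell raises the NameError shown below until you set it.
model.add(Embedding(max_features, output_dim=256))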
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

NameError: ignored

In [0]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D

model = Sequential()
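# input_shape must be a tuple of (timesteps, features), not a bare integer.
# ASSUMPTION: seq_length (timesteps per sample) is not defined in this notebook;
# set it before running, otherwise this cell raises a NameError.
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))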
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

NameError: ignored

In [0]:
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

## Checking Devices/Hardware

* Limited to 10 hours of continuous runtime?
* Memory limits?

In [0]:
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

SystemError: ignored

In [0]:
import psutil
import os

In [0]:
psutil.cpu_percent()

2.7

In [0]:
values = psutil.virtual_memory()
values

svmem(total=13662167040, available=9911345152, percent=27.5, used=11091972096, free=2570194944, active=5469667328, inactive=4326985728, buffers=657301504, cached=6683848704, shared=369168384)

In [0]:
print("Virtual Memory in MB: {}".format(values.total >> 20))
print("Virtual Memory in GB: {}".format(values.total >> 30))
print("Using {}% of total memory".format(values.percent))

Virtual Memory in MB: 13029
Virtual Memory in GB: 12
Using 27.5% of total memory
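
The `>> 20` and `>> 30` shifts above are integer divisions by 2**20 and 2**30, so the results are whole mebibytes and gibibytes, rounded down. A short sketch using the `total` value from the `svmem` output:

```python
total_bytes = 13662167040  # `total` from the svmem output above

mib = total_bytes >> 20  # same as total_bytes // 2**20
gib = total_bytes >> 30  # same as total_bytes // 2**30

print(mib, gib)                       # 13029 12
print(round(total_bytes / 2**30, 2))  # 12.72
```

Use true division, as in the last line, when the fractional part matters.
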


In [0]:
# get filesystem statistics of current directory
statvfs = os.statvfs('.')

print("There are {} GB of disk space in total".format((statvfs.f_frsize * statvfs.f_blocks) >> 30))
print("There are {} GB of disk space free".format((statvfs.f_frsize * statvfs.f_bavail) >> 30))

There are 365 GB of disk space in total
There are 358 GB of disk space free
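
As an alternative sketch, the standard library's `shutil.disk_usage` returns the same kind of totals already converted to bytes, which avoids multiplying block counts by hand. The numbers will differ from machine to machine.

```python
import shutil

# Named tuple with total, used and free space in bytes for the given path.
usage = shutil.disk_usage('.')

total_gib = usage.total >> 30
free_gib = usage.free >> 30

print("There are {} GB of disk space in total".format(total_gib))
print("There are {} GB of disk space free".format(free_gib))
```
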


In [0]:
from tensorflow.python.client import device_lib

device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 6023683497922980489, name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 263979008
 locality {
   bus_id: 1
 }
 incarnation: 5854057392093976711
 physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7"]

In [0]:
# https://pypi.python.org/pypi/pydot
!apt-get -qq install -y graphviz && pip install -q pydot
import pydot

In [0]:
# To determine which version you're using:
!pip show tensorflow

# To upgrade to the current release:
!pip install --upgrade tensorflow

# For a specific version:
!pip install tensorflow==1.2

# For the latest nightly build:
!pip install tf-nightly

Name: tensorflow
Version: 1.6.0rc1
Summary: TensorFlow helps the tensors flow
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: gast, wheel, astor, six, tensorboard, grpcio, protobuf, absl-py, numpy, termcolor
Requirement already up-to-date: tensorflow in /usr/local/lib/python3.6/dist-packages
Collecting numpy>=1.13.3 (from tensorflow)
  Downloading numpy-1.14.1-cp36-cp36m-manylinux1_x86_64.whl (12.2MB)
    100% |████████████████████████████████| 12.2MB 108kB/s 
Requirement already up-to-date: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow)
Requirement already up-to-date: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow)
Requirement already up-to-date: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow)
Requirement already up-to-date: termcolor>=1.1.0 in /usr/local/lib/python3.6/d

Collecting markdown==2.2.0 (from tensorflow==1.2)
  Downloading Markdown-2.2.0.tar.gz (236kB)
    100% |████████████████████████████████| 245kB 4.0MB/s 
Collecting backports.weakref==1.0rc1 (from tensorflow==1.2)
  Downloading backports.weakref-1.0rc1-py3-none-any.whl
Building wheels for collected packages: markdown
  Running setup.py bdist_wheel for markdown ... done
  Stored in directory: /content/.cache/pip/wheels/b9/4f/6c/f4c1c5207c1d0eeaaf7005f7f736620c6ded6617c9d9b94096
Successfully built markdown
Installing collected packages: markdown, backports.weakref, tensorflow
  Found existing installation: Markdown 2.6.11
    Uninstalling Markdown-2.6.11:
      Successfully uninstalled Markdown-2.6.11
  Found existing installation: tensorflow 1.6.0rc1
    Uninstalling tensorflow-1.6.0rc1:
      Successfully uninstalled tensorflow-1.6.0rc1
Successfully installed backports.weakref-1.0rc1 markdown-2.2.0 tensorflow-1.2.0
Collecting tf-nightly
  Downloading t

Collecting markdown>=2.6.8 (from tb-nightly<1.8.0a0,>=1.7.0a0->tf-nightly)
  Downloading Markdown-2.6.11-py2.py3-none-any.whl (78kB)
    100% |████████████████████████████████| 81kB 9.9MB/s 
Installing collected packages: markdown, tb-nightly, tf-nightly
  Found existing installation: Markdown 2.2.0
    Uninstalling Markdown-2.2.0:
      Successfully uninstalled Markdown-2.2.0
Successfully installed markdown-2.6.11 tb-nightly-1.7.0a20180222 tf-nightly-1.7.0.dev20180222
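
The installed version can also be checked from Python rather than by reading `pip show` output. This is a minimal sketch using the standard `importlib.metadata` module (Python 3.8 or newer, so newer than the runtime shown above); `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as the one reported by `pip show`,
# or None when tensorflow is not installed in this environment.
print(installed_version("tensorflow"))
```
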


# Linux/Unix commands

One downside (depending on how you look at it) of having only the notebook interface is that you have no direct access to a terminal or to the underlying operating system of the virtual machine the notebook runs on. The `!` operator in IPython works around this limitation: any line prefixed with `!` runs as a shell command, as if you had typed it in a terminal.
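
The `!` syntax is specific to IPython. For comparison, here is a rough plain-Python sketch of running a shell command and capturing its output with the standard `subprocess` module; the `echo` command is only an illustration.

```python
import subprocess

# Run a command through the shell and capture what it prints.
result = subprocess.run(
    "echo hello from the shell",
    shell=True,
    capture_output=True,
    text=True,
)

print(result.returncode)      # 0 on success
print(result.stdout.strip())  # hello from the shell
```
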

In [0]:
# who am I?
!whoami

root


In [0]:
# where am I?
!pwd

/content


In [0]:
# what is here?
!ls

datalab


## Downloading/Uploading Files

For documentation on all the options, see the [Google Notebook](https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb&scrollTo=BaCkyg5CV5jF).

In addition to what is listed in the Google notebook docs, you can interact with files on the web using Unix/Linux commands 👇

In [0]:
# download a GitHub repository as a ZIP archive
!wget https://github.com/tensorflow/magenta/archive/master.zip

--2018-02-03 00:44:36--  https://github.com/tensorflow/magenta/archive/master.zip
Resolving github.com (github.com)... 192.30.253.112, 192.30.253.113
Connecting to github.com (github.com)|192.30.253.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/tensorflow/magenta/zip/master [following]
--2018-02-03 00:44:37--  https://codeload.github.com/tensorflow/magenta/zip/master
Resolving codeload.github.com (codeload.github.com)... 192.30.253.121, 192.30.253.120
Connecting to codeload.github.com (codeload.github.com)|192.30.253.121|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘master.zip’

master.zip              [   <=>              ]  13.28M  18.1MB/s    in 0.7s    

2018-02-03 00:44:38 (18.1 MB/s) - ‘master.zip’ saved [13929640]
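
If `wget` is not available, a download like this can be sketched with the standard library instead. `download` is a hypothetical helper, not part of Colab or of the original notebook.

```python
import shutil
import urllib.request


def download(url, destination):
    """Stream the response body at `url` into the file `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

Calling `download("https://github.com/tensorflow/magenta/archive/master.zip", "master.zip")` would fetch the same archive as the `wget` cell above.
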



In [0]:
# it worked!
!ls

datalab  magenta.zip  master.zip


In [0]:
# now we need to unzip it
!unzip master.zip

Archive:  master.zip
e6597d7918d4b374dba407ca91654a7d8f884fbb
   creating: magenta-master/
 extracting: magenta-master/.gitignore  
 extracting: magenta-master/.gitmodules  
  inflating: magenta-master/AUTHORS  
  inflating: magenta-master/LICENSE  
  inflating: magenta-master/README.md  
  inflating: magenta-master/WORKSPACE  
   creating: magenta-master/demos/
  inflating: magenta-master/demos/README.md  
   creating: magenta-master/kokoro/
   creating: magenta-master/kokoro/gcp_ubuntu/
   creating: magenta-master/kokoro/gcp_ubuntu/py2/
  inflating: magenta-master/kokoro/gcp_ubuntu/py2/presubmit.sh  
   creating: magenta-master/kokoro/gcp_ubuntu/py3/
  inflating: magenta-master/kokoro/gcp_ubuntu/py3/presubmit.sh  
  inflating: magenta-master/kokoro/test.sh  
  inflating: magenta-master/magenta-logo-bg.png  
   creating: magenta-master/magenta/
  inflating: magenta-master/magenta/BUILD  
  inflating: magenta-master/magenta/__init__.py  
   creating: magenta-maste


  inflating: magenta-master/magenta/reviews/assets/gan/image01.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image02.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image03.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image04.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image05.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image06.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image07.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image08.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image09.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image10.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image11.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image12.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image13.png  
  inflating: magenta-master/magenta/reviews/assets/gan/image14.png  
  inflating: magent
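
The same extraction can be done without the `unzip` binary. The following is a minimal sketch using the standard `zipfile` module; `unzip` here is a hypothetical Python helper that returns the names of the archive members instead of printing them.

```python
import zipfile


def unzip(archive_path, target_dir="."):
    """Extract every member of the archive into target_dir and list them."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

For the archive downloaded above, `unzip("master.zip")` would create the `magenta-master/` directory in the current folder.
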

In [0]:
!ls

datalab  magenta-master  magenta.zip  master.zip


In [0]:
# get rid of the ZIP
!rm master.zip

In [0]:
from google.colab import files

In [0]:
files.upload()

KeyboardInterrupt: ignored

In [0]:
# files.download expects the path of a file on the VM as its argument
files.download()

MessageError: ignored