# Google Colaboratory Notebook Template
## Description:
This template provides an easy way for developers to integrate their Google Drive with the Colaboratory environment.
In addition, some useful package installation tips are included.
You can run each segment as needed.


## Check the default processes and your location on the server.

### If you encounter an Out of Memory (OOM) issue, you can restart the instance by executing !kill -9 -1

In [1]:
!ps -ef
!pwd

# Uncomment the line below to restart the instance / Colaboratory service
# !kill -9 -1

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 22:24 ?        00:00:00 /bin/bash -e /datalab/run.sh
root        70     1  0 22:24 ?        00:00:00 node /tools/node/bin/forever --m
root        80    70  0 22:24 ?        00:00:00 /tools/node/bin/node /datalab/we
root        90    80  1 22:24 ?        00:00:00 /usr/bin/python /usr/local/bin/j
root       111    90 58 22:26 ?        00:00:03 /usr/bin/python3 -m ipykernel_la
root       131   111 98 22:26 pts/0    00:00:00 /bin/sh -c ps -ef
root       132   131  0 22:26 pts/0    00:00:00 ps -ef
/content
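
Before restarting the instance, it can help to check how much memory is actually left. The following is a minimal sketch, assuming a Linux instance that exposes `/proc/meminfo`; the helper names `parse_meminfo` and `available_memory_fraction` are hypothetical and not part of the original template.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_memory_fraction(info):
    """Return MemAvailable / MemTotal, or None if either field is missing."""
    total = info.get("MemTotal")
    available = info.get("MemAvailable")
    if not total or available is None:
        return None
    return available / total


# Only attempt the read where the file exists (for example, on a Colab instance).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        fraction = available_memory_fraction(parse_meminfo(f.read()))
    if fraction is not None:
        print("Available memory: {:.0%}".format(fraction))
```

If the reported fraction is very low, restarting with `!kill -9 -1` as described above is one way to free the memory.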


## Step 1. Prepare the Google Drive connection to leverage the GPU computing power

### Let's create a folder under Google Drive, say 'workspace'.

![Create folder](https://cheng-lin-li.github.io/images/2018-04-04/create_folder.png)

## Step 2. Create the Colaboratory notebook.
### Change your current folder to 'workspace', which you just created.
### Now create your Google Colaboratory notebook by right-clicking in the folder and selecting 'Colaboratory'.
### Or you can [download my Google Colaboratory template from here](https://cdn.rawgit.com/Cheng-Lin-Li/Cheng-Lin-Li.github.io/master/resources/2018-04-04/GoogleColaboratoryNotebookTemplate.ipynb) and upload the file to the 'workspace' folder, then open it.

![Image of create folder](https://cheng-lin-li.github.io/images/2018-04-04/create_file.png)

## Step 3. Enable the GPU
### Go to Edit > Notebook settings > Change runtime type (or Runtime > Change runtime type), then select GPU as the hardware accelerator.

![Enable GPU](https://cheng-lin-li.github.io/images/2018-04-04/enable_gpu.png)

## Step 4. Grant Google Drive and content access privileges to the Google Colaboratory server/instance.

In [0]:
# Install the necessary software
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
# Redirect stdout first, then stderr, so that both are silenced
!add-apt-repository -y ppa:alessandro-strada/ppa > /dev/null 2>&1
!apt-get update -qq > /dev/null 2>&1
!apt-get -y install -qq google-drive-ocamlfuse fuse


# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

'apt-get' is not recognized as an internal or external command,
operable program or batch file.
The system cannot find the path specified.
The system cannot find the path specified.
'apt-get' is not recognized as an internal or external command,
operable program or batch file.


ModuleNotFoundError: No module named 'google.colab'
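
The `google.colab` module exists only on a Colab instance, so the cell fails with `ModuleNotFoundError` anywhere else. As a hedged alternative to the FUSE-based setup, here is a minimal sketch that uses the `google.colab.drive` helper and skips the mount when the notebook is not running on Colab. The function name `mount_drive` and the mount point `/content/drive` are assumptions, not part of the original template.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive on Colab; return False when not running on Colab."""
    try:
        # This import only succeeds on a Colab instance.
        from google.colab import drive
    except ImportError:
        print("Not running on Colab; skipping the Drive mount.")
        return False
    # On Colab this prompts for authorization before mounting.
    drive.mount(mount_point)
    return True


mounted = mount_drive()
```

With this helper, Drive files appear under the chosen mount point rather than under the `drive` folder used in the steps below, so the paths would need to be adjusted accordingly.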

## Step 5. Change to the workspace folder and copy the necessary files from Google Drive to the Google Colab instance.

In [0]:
!mkdir -p drive
!google-drive-ocamlfuse -o nonempty drive
!pwd
!ls
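# Note: each `!` command runs in its own subshell, so `!cd drive` does not change
# the notebook's working directory (the next `!ls` lists the same folder).
# os.chdir is used below to actually change it.
!cd drive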
!ls
import os
os.chdir("drive/workspace")
!ls
!cp -R * ../../
os.chdir("../../")
!ls -rlt

/content
datalab  drive
datalab  drive
campaign_data.csv    sample_submission_4fcZwvQ.csv  train.csv
Email_Predict.ipynb  test_BDIfz5B.csv
campaign_data.csv    sample_submission_4fcZwvQ.csv  train.csv
Email_Predict.ipynb  test_BDIfz5B.csv
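
The copy step can also be written in plain Python, which makes it easier to see exactly which files were transferred. This is a minimal sketch, not the template's own method. The helper name `copy_top_level_files` is hypothetical, and unlike `cp -R` it copies only the regular files directly inside the source folder, not subfolders.

```python
import os
import shutil


def copy_top_level_files(src_dir, dst_dir):
    """Copy regular files directly inside src_dir into dst_dir; return their names."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        if os.path.isfile(src_path):
            # copy2 also preserves timestamps where the file system allows it.
            shutil.copy2(src_path, os.path.join(dst_dir, name))
            copied.append(name)
    return copied


# Example call, assuming the folder layout from this step:
# copy_top_level_files("drive/workspace", ".")
```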


## Step 6. Make sure the GPU is ready

In [0]:
# Is TensorFlow using the GPU? An empty string means no GPU device was found.
import tensorflow as tf
tf.test.gpu_device_name()

  from ._conv import register_converters as _register_converters


''
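
An empty string like the one above means TensorFlow did not find a GPU. The sketch below turns that result into a readable message; the helper name `gpu_status` is hypothetical, and the TensorFlow import is guarded so the cell does not fail where TensorFlow is missing.

```python
def gpu_status(device_name):
    """Turn the value returned by tf.test.gpu_device_name() into a message."""
    if device_name:
        return "GPU found at: " + device_name
    return "No GPU found; check Runtime > Change runtime type."


try:
    import tensorflow as tf
    print(gpu_status(tf.test.gpu_device_name()))
except ImportError:
    print("TensorFlow is not installed in this environment.")
```

If no GPU is reported, repeating the runtime setting from Step 3 and rerunning the cell is the first thing to try.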

## Step 7. Install additional libraries and try to import your libraries.

### Install requirements

In [1]:
!pip install -r ./requirements.txt

Could not open requirements file: [Errno 2] No such file or directory: './requirements.txt'
You are using pip version 10.0.0, however version 10.0.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
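
The error above appears when `requirements.txt` is not in the current working directory, for example when the copy in Step 5 has not run yet. A minimal sketch that checks for the file before calling pip follows; the helper name `install_requirements` is hypothetical.

```python
import os
import subprocess
import sys


def install_requirements(path="requirements.txt"):
    """Run pip against a requirements file; return False if the file is missing."""
    if not os.path.isfile(path):
        print("Requirements file not found:", os.path.abspath(path))
        return False
    # Use the running interpreter's pip so packages land in the kernel's environment.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", path])
    return True


# Example call:
# install_requirements("./requirements.txt")
```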


### Include libraries

In [0]:
%matplotlib inline
import pickle, gzip
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from ast import literal_eval
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics.pairwise import linear_kernel, cosine_similarity
from nltk.stem.snowball import SnowballStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import wordnet

from nltk.classify import NaiveBayesClassifier
from nltk.corpus import subjectivity
from nltk.sentiment import SentimentAnalyzer
from nltk.sentiment.util import *
import re
import nltk
from nltk.corpus import stopwords

import tensorflow as tf

import warnings; warnings.simplefilter('ignore')

### Step 7-1. Example: download additional NLTK stop words

In [0]:
# Uncomment the next line on the first run to download the stop word list
# nltk.download('stopwords')
stopWords = set(stopwords.words('english'))
print(len(stopWords))
print(stopWords)

179
{'after', "isn't", 'above', 'what', 'during', 'below', 'yourself', 'so', 'but', "shouldn't", 'off', 'by', 'about', 'hasn', 'hers', 'd', "haven't", 'with', 'not', 'from', 'shan', 'that', 'o', 'into', 'at', 'of', 'too', 'ourselves', 'over', 'on', 'through', 'your', 'are', 'same', 'weren', 'she', 'these', "don't", 'aren', "it's", 'shouldn', 'as', 'up', 'is', 'does', 's', 'were', "should've", 'such', 'between', 'no', 'under', "wasn't", 'which', "mightn't", 'haven', 'it', 'themselves', 'how', 'i', 'ain', 'the', 'where', "you'll", 'mustn', 'some', "she's", "didn't", 'my', 'being', 'me', 'isn', 'in', 'for', 'more', 'ma', 'been', 'll', 'didn', "you're", 'than', "doesn't", 'yours', 'did', 'y', 'hadn', "hasn't", 'his', 'this', 'only', 'be', 'doing', "shan't", 'why', 'very', 'before', 'ours', 'few', 'm', "aren't", 'most', 'any', 'couldn', 'himself', 't', 'you', 'once', 'nor', 'other', 'will', 'a', "couldn't", 'them', "hadn't", 'have', "mustn't", 'because', 'we', 'had', 'myself', 'or', 'hersel
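
A stop word set like this is typically used to filter tokens before further text processing. Here is a minimal sketch of that filtering in plain Python. The helper name `remove_stop_words` is hypothetical, and the small sample set stands in for the full NLTK list so the example runs without any download.

```python
def remove_stop_words(tokens, stop_words):
    """Return the tokens whose lowercase form is not in stop_words."""
    return [token for token in tokens if token.lower() not in stop_words]


# A few entries taken from the printed set above, used here as a stand-in.
sample_stop_words = {"the", "is", "a", "of", "after"}

tokens = "The model is a kind of magic".split()
print(remove_stop_words(tokens, sample_stop_words))
# → ['model', 'kind', 'magic']
```

In the notebook itself, `stopWords` from the cell above could be passed in place of `sample_stop_words`.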

## Step 8. Save your data to the Colab instance, then copy the files to your Google Drive.
### 8-1. Save a Keras model file to the Colab instance, then copy the model file to Google Drive.

In [0]:
model.save("cnn_model.h5")
!cp cnn_model* ./drive/workspace

In [0]:
model.save("cnn_lstm_model.h5")
!cp cnn_lstm_model* ./drive/workspace

In [0]:
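# Assumes standalone Keras; with TensorFlow's bundled Keras, import from tensorflow.keras.models instead
from keras.models import load_model
model = load_model("cnn_model.h5")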

In [0]:
model = load_model("cnn_lstm_model.h5")

### 8-2. Save Python objects to a file, then copy it to Google Drive.

In [0]:
# Use a context manager so the gzip file is flushed and closed before it is copied
with gzip.open("email_words_test.pkl", 'wb') as f:
    pickle.dump((list_test_userlist_wordlist_index, word_index), f)
!cp email_words_test* ./drive/NLP

In [0]:
with gzip.open("email_words_test.pkl", 'rb') as f:
    (list_test_userlist_wordlist_index, word_index) = pickle.load(f)
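
The save and load steps can be wrapped in a pair of small helpers so every object goes through the same compressed round trip. This is a minimal sketch; the function names `save_compressed` and `load_compressed` are hypothetical.

```python
import gzip
import pickle


def save_compressed(obj, path):
    """Pickle obj into a gzip-compressed file at path."""
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Load an object written by save_compressed. Only use on files you trust."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Example call, with placeholder data:
# save_compressed(([1, 2, 3], {"hello": 1}), "example.pkl")
# word_lists, word_index = load_compressed("example.pkl")
```

Unpickling can execute arbitrary code, so it is safest to load only files you created yourself.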

## Step 9. Back up your results to Google Drive
### Assume your data and models are stored in the ./data folder.
### You want to sync everything under the ./data folder to Google Drive.

In [None]:
!ls
# Create the target folder first; cp fails if it does not exist
!mkdir -p ./drive/workspace/data
!cp -R ./data/* ./drive/workspace/data
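
The same backup can be sketched in Python, which creates the target folder when needed and reports what was copied. The helper name `backup_tree` is hypothetical, and the `dirs_exist_ok` argument assumes Python 3.8 or later.

```python
import os
import shutil


def backup_tree(src_dir, dst_dir):
    """Recursively copy src_dir into dst_dir, overwriting files that already exist.

    Returns the sorted relative paths of all files now present in dst_dir.
    """
    if not os.path.isdir(src_dir):
        raise FileNotFoundError("Source folder not found: " + src_dir)
    # dirs_exist_ok requires Python 3.8+; it lets repeated backups reuse the folder.
    shutil.copytree(src_dir, dst_dir, dirs_exist_ok=True)
    return sorted(
        os.path.relpath(os.path.join(root, name), dst_dir)
        for root, _dirs, files in os.walk(dst_dir)
        for name in files
    )


# Example call, assuming the folder layout from this step:
# backup_tree("./data", "./drive/workspace/data")
```

This copies everything on every run. It does not delete files from the target that were removed from the source, so it is a simple backup rather than a true sync.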