### 1/ Make sure you have activated the Python virtual environment that is used as the notebook kernel

In [1]:
!which python

/Users/kenly/Documents/Work/ISS-IS02PT/PRS-PM-ISY5002-GROUP5/SystemCode/.venv/sandbox/bin/python
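
The same check can be made from inside the kernel itself, which avoids relying on the shell's `PATH`. This is a minimal sketch: the helper name `in_virtualenv` is hypothetical, and the comparison only detects `venv`-style environments (a conda environment, for example, would not be reported).

```python
import sys

def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

# Path of the interpreter that is actually running this kernel
print(sys.executable)
print("virtual environment active:", in_virtualenv())
```

If the printed path does not sit under the expected `.venv` directory, the notebook is running on a different kernel than the one `pip` installs into.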


### 2/ Install the google-cloud-storage package

In [4]:
!pip install google-cloud-storage

Collecting google-cloud-storage
  Downloading google_cloud_storage-1.31.0-py2.py3-none-any.whl (88 kB)
Collecting google-cloud-core<2.0dev,>=1.4.1
  Downloading google_cloud_core-1.4.1-py2.py3-none-any.whl (26 kB)
Collecting google-resumable-media<2.0dev,>=1.0.0
  Downloading google_resumable_media-1.0.0-py2.py3-none-any.whl (42 kB)
Collecting google-auth<2.0dev,>=1.11.0
  Downloading google_auth-1.21.0-py2.py3-none-any.whl (92 kB)
Collecting google-api-core<2.0.0dev,>=1.19.0
  Downloading google_api_core-1.22.1-py2.py3-none-any.whl (91 kB)
Collecting google-crc32c<2.0dev,>=1.0; python_version >= "3.5"
  Downloading google-crc32c-1.0.0.tar.gz (10 kB)
  Installing build dependencies ... done
  Getting requirements to build wh

### 3/ Assume we have a NumPy object whose state we want to save. The steps are as follows:
- 3a: Serialize the object in the notebook to a binary file
- 3b: Upload that binary file to Google Cloud Storage (make sure you have the service account's private JSON key file ready, and never commit that JSON file to git)
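
One way to keep the key file out of the repository is to store it outside the project folder and read its location from an environment variable. The sketch below assumes the path is exported as `GOOGLE_APPLICATION_CREDENTIALS`, the variable the Google client libraries conventionally read; the helper name `service_account_key_path` is hypothetical and not part of any library.

```python
import os
from pathlib import Path

def service_account_key_path(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> Path:
    """Return the key file path stored in env_var, failing early if it is unusable."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"key file not found: {path}")
    return path

# Usage (assumes google-cloud-storage is installed and the variable is set):
# from google.cloud import storage
# storage_client = storage.Client.from_service_account_json(str(service_account_key_path()))
```

Adding the key's file name pattern to `.gitignore` is a useful second safeguard in case a copy ends up in the working tree.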

In [2]:
import numpy as np
import pickle

# Assuming we have an array of evenly-spaced values
my_arr = np.arange(0,10,1)
my_arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [3]:
# Serialize and export the object
with open('./my_arr.pkl', 'wb') as my_arr_pkl:
    pickle.dump(my_arr, my_arr_pkl)

In [4]:

from datetime import datetime
from google.cloud import storage

# Explicitly use service account credentials by specifying the private key file.
storage_client = storage.Client.from_service_account_json('my-spark-iss-0cc3a9e9a54d.json')

# Upload the serialized object to Google Cloud Storage

# Bucket used for our project
bucket_name = "my-spark-iss-us-central1"
# File written by the cell above
source_file_name = "./my_arr.pkl"
# The current time is appended to the destination blob name to tell versions apart
destination_blob_name = "ken/my_arr.pkl" + "." + datetime.now().strftime("%d-%b-%Y_%H:%M:%S")

bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)

blob.upload_from_filename(source_file_name)
print("File {} uploaded to {}".format(source_file_name, destination_blob_name))

File ./my_arr.pkl uploaded to ken/my_arr.pkl.03-Sep-2020_22:45:48
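
The versioned naming used in the upload cell can be wrapped in a small function so that every upload builds its blob name the same way. This is a sketch only: `versioned_blob_name` is a hypothetical helper, and it reuses the timestamp format from the cell above.

```python
from datetime import datetime
from typing import Optional

def versioned_blob_name(base_name: str, when: Optional[datetime] = None) -> str:
    """Append a timestamp suffix to base_name, using the format from the upload cell."""
    when = when or datetime.now()
    return base_name + "." + when.strftime("%d-%b-%Y_%H:%M:%S")

# A fixed datetime makes the result reproducible
print(versioned_blob_name("ken/my_arr.pkl", datetime(2020, 9, 3, 22, 45, 48)))
# ken/my_arr.pkl.03-Sep-2020_22:45:48  (the month abbreviation depends on the locale)
```

Note that this day-first format does not sort chronologically as plain text, so the newest version cannot be found by sorting blob names alone.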


### 4/ Restore the state of the object for use in subsequent processing. The steps are as follows:
- 4a: Download the binary file from Google Cloud Storage
- 4b: Recreate the object in the notebook from that binary file

In [5]:
from google.cloud import storage

# Explicitly use service account credentials by specifying the private key file.
storage_client = storage.Client.from_service_account_json('my-spark-iss-0cc3a9e9a54d.json')

# Download the blob from the bucket (this makes an authenticated API request)
bucket_name = "my-spark-iss-us-central1"
source_blob_name = "ken/my_arr.pkl.03-Sep-2020_22:45:48"        # Blob name printed by the upload cell above
destination_file_name = "./my_arr.pkl"

bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(source_blob_name)
blob.download_to_filename(destination_file_name)
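
Instead of copying the blob name by hand, the most recent version could be picked from a listing of the bucket. The sketch below is an assumption-laden illustration: `latest_blob_name` is a hypothetical helper, the stand-in objects and their timestamps are made up for the demo, and it relies only on the `name` and `time_created` attributes that `google.cloud.storage` blobs expose.

```python
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Iterable, Optional

def latest_blob_name(blobs: Iterable) -> Optional[str]:
    """Return the name of the most recently created blob, or None if there is none."""
    newest = max(blobs, key=lambda b: b.time_created, default=None)
    return None if newest is None else newest.name

# Stand-in objects (made-up data) that mimic the two attributes used above
fake_blobs = [
    SimpleNamespace(name="ken/my_arr.pkl.03-Sep-2020_22:45:48",
                    time_created=datetime(2020, 9, 3, 22, 45, 48, tzinfo=timezone.utc)),
    SimpleNamespace(name="ken/my_arr.pkl.04-Aug-2020_09:00:00",
                    time_created=datetime(2020, 8, 4, 9, 0, 0, tzinfo=timezone.utc)),
]
print(latest_blob_name(fake_blobs))

# With a real client (assumes the same bucket and prefix as in the cells above):
# blobs = storage_client.list_blobs("my-spark-iss-us-central1", prefix="ken/my_arr.pkl.")
# source_blob_name = latest_blob_name(blobs)
```

Comparing `time_created` rather than the names matters here, because the timestamp suffix is not in a text-sortable order.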

In [6]:
# Load the object 
with open('./my_arr.pkl', 'rb') as my_arr_pkl:
    my_arr = pickle.load(my_arr_pkl)

# We get back an array with the same contents as the original
my_arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
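
Rather than comparing the printed arrays by eye, the round trip can be checked programmatically. This minimal sketch uses an in-memory `pickle.dumps` / `pickle.loads` pair as a stand-in for the upload and download steps, so it runs without any cloud access; the variable names are illustrative.

```python
import pickle

import numpy as np

original = np.arange(0, 10, 1)

# In-memory stand-in for "serialize, upload, download, deserialize"
restored = pickle.loads(pickle.dumps(original))

# The restored array is a new object with equal contents and dtype
assert restored is not original
assert np.array_equal(restored, original)
assert restored.dtype == original.dtype
print("round trip OK")
```

In the notebook itself, the same `np.array_equal` call could compare the array loaded from `./my_arr.pkl` with a freshly built `np.arange(0, 10, 1)`.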