# How to build Google Cloud SDK in Python

###### Author: Mohamed Niang, Data Scientist at FeetMe

# Requirements

**1. From the Anaconda Prompt, install the required libraries with the following commands:**

``` Python
pip install PyMySQL
pip install google-cloud-storage
```

**2. Create the config.py file in your current working folder with the following content and save it:**

``` Python
"""Google Cloud Storage Configuration."""

import os
from os import environ


# Google Cloud Storage
bucketURL = environ.get('GCP_BUCKET_URL')

bucketName = environ.get('GCP_BUCKET_NAME')

bucketFolder = environ.get('GCP_BUCKET_FOLDER_NAME')

# Local folder to download Data
localFolder = os.path.abspath(os.getcwd())
```
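
Because `environ.get` returns `None` when a variable is not set, it can help to check the configuration before using it. The following is a minimal sketch, assuming the three variable names used in config.py above; `missing_settings` is a hypothetical helper, not part of any library.

```python
import os

# Variable names taken from config.py above
REQUIRED_VARS = ("GCP_BUCKET_URL", "GCP_BUCKET_NAME", "GCP_BUCKET_FOLDER_NAME")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"GCP_BUCKET_NAME": "my-bucket", "GCP_BUCKET_URL": ""}
print(missing_settings(example_env))
```

Calling `missing_settings()` without arguments checks the real environment; an empty list means all three variables are set.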

# Import libraries

In [1]:
# Pandas library
import pandas as pd

# Cloud SQL library
import pymysql

# Cloud Storage library
import os
from google.cloud import storage
from config import bucketName, localFolder, bucketFolder

**Important:** <div style="text-align: justify"> Before you can connect to Cloud SQL from Python, you must set up the Cloud SQL proxy, which authenticates to Google Cloud on behalf of your application using a set of Google credentials. This is a separate process from authenticating database users. </div>

# Create a Proxy for Cloud SQL Authentication

<div style="text-align: justify"> First, download the Cloud SQL proxy from the following link: </div>

-  https://dl.google.com/cloudsql/cloud_sql_proxy_x64.exe

<div style="text-align: justify"> Select Save Link As, save the file in the Windows System32 folder shown below, and rename it to cloud_sql_proxy.exe. </div>

- **C:\Windows\System32**

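<div style="text-align: justify"> Then open the Windows Command Prompt in administrator mode and run the following command, replacing myProject, us-central1, and myInstance with your own project, region, and instance names:
</div>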

- **cloud_sql_proxy -instances=myProject:us-central1:myInstance=tcp:3306**
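
Before connecting, it can be useful to confirm that the proxy is actually listening on the local port. The following is a minimal sketch using only the standard library; `proxy_is_listening` is a hypothetical helper, and the default port 3306 is taken from the command above.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=3306, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


print(proxy_is_listening())
```

This only shows that something accepts connections on that port; it does not verify that the proxy has authenticated successfully.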

# Cloud SQL Connection

<div style="text-align: justify"> To connect to a Cloud SQL database from Python, you need the connection parameters, which you can obtain from the user who created the project in Google Cloud. The parameters are:
</div>

- user
- password
- db

**Important: Because the connection goes through the local Cloud SQL proxy, the hostname is '127.0.0.1'.**

In [2]:
# Define the Cloud SQL connection
connection = pymysql.connect(host='127.0.0.1',
                                 user='user',
                                 password='password',
                                 db='database')
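
To avoid hard-coding the user, password, and database name in the notebook, the parameters can be read from environment variables. The following is a minimal sketch; the variable names `CLOUD_SQL_USER`, `CLOUD_SQL_PASSWORD`, and `CLOUD_SQL_DB` and the helper `connection_settings` are assumptions made for illustration, not names defined by PyMySQL or Google Cloud.

```python
import os


def connection_settings(env=None):
    """Build keyword arguments for pymysql.connect from environment variables.

    The variable names are hypothetical; choose any names you like.
    """
    env = os.environ if env is None else env
    return {
        "host": "127.0.0.1",  # the local Cloud SQL proxy
        "port": 3306,         # port given in the proxy command
        "user": env["CLOUD_SQL_USER"],
        "password": env["CLOUD_SQL_PASSWORD"],
        "db": env["CLOUD_SQL_DB"],
    }


# Example with an explicit mapping instead of the real environment
example = connection_settings({
    "CLOUD_SQL_USER": "user",
    "CLOUD_SQL_PASSWORD": "password",
    "CLOUD_SQL_DB": "database",
})
print(sorted(example))
```

With the variables set, the connection could then be opened with `pymysql.connect(**connection_settings())`.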

In [3]:
with connection: 
    cur = connection.cursor()
    cur.execute("SELECT * FROM path_data")

In [4]:
print(cur.description)

(('path_to_movie', 253, None, 1020, 1020, 0, True), ('recordID', 253, None, 1020, 1020, 0, False))


In [5]:
data = cur.fetchall()

In [6]:
df = pd.DataFrame(list(data), columns = ['path_to_movie', 'recordID'])
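
The column names do not have to be typed by hand: the first element of each entry in `cur.description` is the column name. The following is a minimal sketch, where `column_names` is a hypothetical helper and the sample description is the one printed in cell In [4].

```python
def column_names(description):
    """Extract column names from a DB-API cursor description."""
    return [column[0] for column in description]


# The description printed in cell In [4]
description = (
    ("path_to_movie", 253, None, 1020, 1020, 0, True),
    ("recordID", 253, None, 1020, 1020, 0, False),
)
print(column_names(description))  # ['path_to_movie', 'recordID']
```

The DataFrame could then be built with `pd.DataFrame(list(data), columns=column_names(cur.description))`.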

In [7]:
# Set recordID as the index column
df.set_index("recordID", inplace=True)

In [8]:
display(df)

| recordID | path_to_movie |
|---|---|
| 3rrfUBx2yNisxcTY8n6o | VID_20190319_151442.mp4 |


In [9]:
# Look up the video path for a test recordID
recordID_vid_to_download = df.loc['3rrfUBx2yNisxcTY8n6o', 'path_to_movie']

In [10]:
print(recordID_vid_to_download)

VID_20190319_151442.mp4


# Cloud Storage Connection

<div style="text-align: justify"> To connect to Google Cloud Storage from Python, you need to provide service account credentials, most simply by setting an environment variable.
</div>

**1. After creating a service account, you have two ways to provide its credentials to your application: set the environment variable GOOGLE_APPLICATION_CREDENTIALS, or specify the path to the service account key in the code.**

**2. Provide authentication credentials to your application code by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS. In the cell below, replace file.json with the path to the JSON file containing your service account key. When set this way, the variable applies only to the current Python session, so you must set it again in each new session.**

In [11]:
# Set the GOOGLE_APPLICATION_CREDENTIALS environment variable from Python
# ("file.json" is a placeholder for the path to your service account key)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "file.json"

In [12]:
print(os.environ['GOOGLE_APPLICATION_CREDENTIALS']) 

FeetMe-Mobility-Dev-e90a29c38323.json
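
The second option mentioned above, specifying the key path in the code, can be sketched as follows. This is a minimal example, assuming the key file exists locally; `resolve_key_path` and `make_storage_client` are hypothetical helpers, while `storage.Client.from_service_account_json` is the constructor provided by google-cloud-storage.

```python
from pathlib import Path


def resolve_key_path(key_path):
    """Return the absolute path of the key file, or raise if it is missing."""
    path = Path(key_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Service account key not found: {path}")
    return str(path)


def make_storage_client(key_path):
    """Create a Cloud Storage client from an explicit service account key."""
    # Imported here so the path check above works without the library installed
    from google.cloud import storage
    return storage.Client.from_service_account_json(resolve_key_path(key_path))
```

With this approach the environment variable is not needed: `storage_client = make_storage_client("file.json")` would replace the `storage.Client()` call.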


In [13]:
# Initialise a client
storage_client = storage.Client()

In [14]:
# Create a bucket object for our bucket
bucket = storage_client.get_bucket('bucket_name')

In [15]:
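# Function to list files
def list_files(bucketName):
    """List all files under bucketFolder in the GCP bucket."""
    # Uses the bucket object created above; bucketName is kept for reference only
    files = bucket.list_blobs(prefix=bucketFolder)
    # Keep only names that contain a dot, i.e. names with a file extension
    fileList = [file.name for file in files if '.' in file.name]
    return fileList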

In [16]:
# List the files in the bucket
storage_files = list_files('bucket_name')

In [17]:
print(storage_files)

['VID_20190319_151442.mp4']
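
The name filter used when listing can be tried without a bucket. The following is a minimal sketch on hypothetical blob names; `keep_file_names` is an illustrative helper that is slightly stricter than the filter above, since it also drops any name ending in '/'.

```python
def keep_file_names(blob_names):
    """Drop names ending in '/' and names without a dot."""
    return [
        name for name in blob_names
        if not name.endswith("/") and "." in name
    ]


# Hypothetical blob names, for illustration only
names = ["videos/", "videos/VID_20190319_151442.mp4", "videos/notes"]
print(keep_file_names(names))  # ['videos/VID_20190319_151442.mp4']
```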


In [18]:
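# Function to download files
def download_file(bucketName, recordID_vid_to_download):
    """Download a file from the GCP bucket into the current working folder."""
    # Uses the bucket object created above
    blob = bucket.blob(recordID_vid_to_download)
    # Keep only the file name, dropping any folder prefix in the blob name
    fileName = blob.name.split('/')[-1]
    blob.download_to_filename(fileName)
    return f'{fileName} downloaded from bucket.'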

In [19]:
# Download Files
download_file(bucketName, recordID_vid_to_download)

'VID_20190319_151442.mp4 downloaded from bucket.'
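
The `localFolder` value from config.py can be used to choose where the file is saved. The following is a minimal sketch of building the destination path; `local_path_for` is a hypothetical helper and the folder names in the example are made up.

```python
import os
import posixpath


def local_path_for(blob_name, local_folder):
    """Return the local file path for a blob, keeping only its base name."""
    # Blob names always use '/' as separator, whatever the local OS is
    file_name = posixpath.basename(blob_name)
    return os.path.join(local_folder, file_name)


# Hypothetical folder names, for illustration only
print(local_path_for("videos/VID_20190319_151442.mp4", "downloads"))
```

Inside a download function this could be used as `blob.download_to_filename(local_path_for(blob.name, localFolder))`.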