# Initial Setup for This Project

This notebook sets up the GCP project `statmike-mlops` for the other notebooks in this repository. If you followed the [`Readme.md`](https://github.com/statmike/vertex-ai-mlops/blob/main/readme.md), this repository of notebooks is already pulled as a local resource in your Vertex AI notebook instance.

---
## Parameters

In [1]:
REGION = 'us-central1'
PROJECT_ID = 'statmike-mlops'

BUCKET = PROJECT_ID

---
## Create Storage Bucket

In [2]:
from google.cloud import storage
gcs = storage.Client(project=PROJECT_ID)

In [3]:
bucketDef = gcs.bucket(BUCKET)
bucket = gcs.create_bucket(bucketDef, project=PROJECT_ID, location=REGION)
bucket

<Bucket: statmike-mlops>
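
Re-running the cell above after the bucket exists raises a conflict error, because `create_bucket` does not tolerate an existing bucket. A minimal sketch of a rerun-safe variant follows, assuming the `lookup_bucket` and `create_bucket` methods of `storage.Client`. The helper name `get_or_create_bucket` is hypothetical and not part of the original notebook.

```python
def get_or_create_bucket(client, name, location):
    """Return the bucket called `name`, creating it in `location` if needed.

    `client` is assumed to be a google.cloud.storage.Client.
    This helper is hypothetical; it is not part of the original notebook.
    """
    # lookup_bucket returns None when the bucket does not exist
    bucket = client.lookup_bucket(name)
    if bucket is None:
        bucket = client.create_bucket(name, location=location)
    return bucket

# Usage in this notebook (needs GCP credentials, so it is left commented out):
# bucket = get_or_create_bucket(gcs, BUCKET, REGION)
```

With this pattern the setup notebook can be run more than once without failing at the bucket step.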

---
## Store Project Data in the Storage Bucket

In [4]:
import pandas as pd
from sklearn import datasets
digits = datasets.load_digits()

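# One row per 8x8 image: 64 pixel columns plus the digit label
digits_df = pd.DataFrame(data=digits.data)
digits_df['target'] = digits.target
# Label each digit as 'Odd' or 'Even'
digits_df['target_OE'] = digits_df['target'].apply(lambda x: 'Odd' if x % 2 == 1 else 'Even')
# Rename the 64 pixel columns (0..63) to p0..p63; keep the remaining column names
digits_df.columns = ['p'+str(i) if i <= 63 else x for i, x in enumerate(digits_df.columns)]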
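
To make the column renaming and the odd/even labeling in the cell above easier to follow, here is a small pure-Python sketch of the same two rules. It uses a plain list as a stand-in for the DataFrame's column labels. The names `columns`, `renamed`, and `odd_even` are illustrative only.

```python
# Stand-in for the DataFrame's column labels:
# 64 integer pixel columns followed by the two named columns
columns = list(range(64)) + ['target', 'target_OE']

# Same renaming rule as the notebook cell: positions 0..63 become p0..p63,
# later positions keep their existing name
renamed = ['p' + str(i) if i <= 63 else x for i, x in enumerate(columns)]

def odd_even(x):
    """Parity label, equivalent to the notebook's lambda for integer digits."""
    return 'Odd' if x % 2 == 1 else 'Even'

print(renamed[:3], renamed[62:])
print([odd_even(d) for d in range(4)])
```

The first `print` shows the first three pixel names and the boundary between pixel columns and label columns. The second shows the labels for digits 0 through 3.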

In [5]:
# Writing to a gs:// path from pandas relies on the gcsfs package being installed
digits_df.to_csv(f'gs://{BUCKET}/digits/data/digits.csv', index=False)

---
## Install AIPlatform Package

The notebook instance does not appear to have `google-cloud-aiplatform` installed. The package is required to run `from google.cloud import aiplatform` in the notebooks that use the Python clients for AI Platform (models, endpoints, jobs, prediction).
- For details, see https://cloud.google.com/ai-platform-unified/docs/start/client-libraries#client_libraries
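
Whether the package is already present can be checked before installing. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`). The helper name `installed_version` is hypothetical.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Prints a version string if the package is installed, otherwise None
print(installed_version('google-cloud-aiplatform'))
```

The same helper works for `tfx`, `tensorflow`, and `tensorflow-io`, which are installed later in this notebook.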

In [6]:
!pip install google-cloud-aiplatform -U -q



---
## Install TFX

In [7]:
!pip install tfx -U -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab-git 0.11.0 requires nbdime<2.0.0,>=1.1.0, but you have nbdime 3.0.0 which is incompatible.


In [8]:
!pip install tensorflow -U -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tfx 0.30.0 requires tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3,>=1.15.2, but you have tensorflow 2.5.0 which is incompatible.
tensorflow-transform 0.30.0 requires tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<2.5,>=1.15.2, but you have tensorflow 2.5.0 which is incompatible.
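
The resolver output above shows that upgrading TensorFlow to 2.5.0 put it outside the version ranges declared by `tfx 0.30.0` and `tensorflow-transform 0.30.0`. A hedged sketch for listing such conflicts after the installs follows, assuming `pip` is available to the running interpreter. The helper name `pip_check` is hypothetical.

```python
import subprocess
import sys

def pip_check():
    """Run `pip check` for the current interpreter; return (exit_code, report)."""
    result = subprocess.run(
        [sys.executable, '-m', 'pip', 'check'],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as str
    )
    return result.returncode, result.stdout + result.stderr

code, report = pip_check()
# pip check exits with a non-zero code when it finds broken requirements
print('no conflicts reported' if code == 0 else report)
```

Running this after all the `pip install` cells gives one consolidated view of the remaining conflicts, rather than one report per install.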


In [9]:
!pip install tensorflow-io -U -q