# 00 - Environment Setup

This notebook sets up the GCP project (`statmike-mlops`) used by the other notebooks in this repository.  Following the [`Readme.md`](https://github.com/statmike/vertex-ai-mlops/blob/main/readme.md), you should already have this repository of notebooks cloned as a local resource in your Vertex AI notebook instance.

---
## Vertex AI - Conceptual Flow

<img src="architectures/slides/slide_03.png">

---
## Vertex AI - Workflow

<img src="architectures/slides/slide_04.png">

---
## Setup

inputs:

In [24]:
REGION = 'us-central1'
PROJECT_ID = 'statmike-mlops'
DATANAME = 'digits'

derived inputs:

In [25]:
BUCKET = PROJECT_ID

packages:

In [26]:
from google.cloud import storage

import pandas as pd
from sklearn import datasets

---
## Create Storage Bucket

In [27]:
gcs = storage.Client(project=PROJECT_ID)

In [28]:
bucketDef = gcs.bucket(BUCKET)
bucket = gcs.create_bucket(bucketDef, project=PROJECT_ID, location=REGION)
bucket

<Bucket: statmike-mlops>
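
Rerunning the cell above after the bucket exists typically fails with a 409 Conflict error from `create_bucket`. A minimal sketch of a rerun-safe variant follows. The helper name `get_or_create_bucket` is hypothetical (it is not part of the original notebook) and it relies only on the storage client's `lookup_bucket` and `create_bucket` methods:

```python
def get_or_create_bucket(client, name, location):
    """Return the bucket called `name`, creating it in `location` if missing.

    `client` is expected to be a google.cloud.storage.Client, or any object
    exposing the same lookup_bucket / create_bucket methods.
    """
    # lookup_bucket returns None when the bucket does not exist
    bucket = client.lookup_bucket(name)
    if bucket is None:
        bucket = client.create_bucket(name, location=location)
    return bucket

# Usage in this notebook (assumes gcs, BUCKET and REGION as defined above):
# bucket = get_or_create_bucket(gcs, BUCKET, REGION)
```

Looking the bucket up first keeps the cell idempotent, so the notebook can be run top to bottom more than once.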

---
## Store Project Data in the Storage Bucket

In [29]:
source = datasets.load_digits()

source_df = pd.DataFrame(data=source.data)
source_df['target'] = source.target
source_df['target_OE'] = source_df['target'].apply(lambda x : 'Odd' if x%2==1 else ('Even' if x%2==0 else ''))
# rename the 64 pixel columns to p0..p63 and keep the remaining column names (target, target_OE)
source_df.columns = ['p'+str(i) if i <= 63 else x for i, x in enumerate(source_df.columns,0)]

source_df.to_csv(f"gs://{BUCKET}/{DATANAME}/data/{DATANAME}.csv", index=False)
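
The column renaming and the odd/even label in the cell above can be checked in isolation without loading the dataset. This is a small sketch using two hypothetical helpers, `pixel_column_names` and `odd_even`, which mirror the list comprehension and the lambda but are not part of the original notebook:

```python
def pixel_column_names(columns, n_pixels=64):
    """Rename the first `n_pixels` columns to p0, p1, ... and keep the rest."""
    return [f"p{i}" if i < n_pixels else name for i, name in enumerate(columns)]

def odd_even(label):
    """Map an integer digit label to 'Odd' or 'Even'."""
    return "Odd" if label % 2 == 1 else "Even"

# The digits data has 64 pixel columns (0..63) followed by the two label columns.
columns = list(range(64)) + ["target", "target_OE"]
renamed = pixel_column_names(columns)
print(renamed[:3], renamed[-3:])
print([odd_even(d) for d in range(4)])
```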

---
## Update AIPlatform Package:

The `google-cloud-aiplatform` package is updated frequently.  Update it to get the latest functionality.

In [1]:
!pip install google-cloud-aiplatform -U -q

---
## Install KFP
If a step fails with an error, rerun it.  The dependencies sometimes resolve on a second attempt.

In [3]:
!pip install kfp -q

In [4]:
!pip install google-cloud-pipeline-components -U -q
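
Because the installs run with `-q`, it can help to confirm which versions actually ended up in the environment. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_versions` is hypothetical, and the package list is simply the set installed above:

```python
from importlib import metadata

def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

print(installed_versions([
    "google-cloud-aiplatform",
    "kfp",
    "google-cloud-pipeline-components",
]))
```

A `None` entry means the package is not visible to the current kernel. After an upgrade, the kernel may need a restart before the new version is imported.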

---
## Other For Specific Notebooks

06 - Plotly used for visualizations

In [1]:
!pip install plotly -q

07 - Test version of the aiplatform client, installed for Feature Store while it is in preview

In [3]:
!pip install --upgrade git+https://github.com/googleapis/python-aiplatform.git@main-test -q

---
## OTHER - Obsolete?

In [2]:
!pip install tensorflow-io -q

In [33]:
!pip install tfx-bsl -U -q

In [34]:
!pip install tensorflow -U -q

In [36]:
!pip install tfx -U -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-io 0.21.0 requires tensorflow<2.7.0,>=2.6.0, but you have tensorflow 2.5.1 which is incompatible.
jupyterlab-git 0.11.0 requires nbdime<2.0.0,>=1.1.0, but you have nbdime 3.1.0 which is incompatible.


In [37]:
!pip install google-cloud-aiplatform -U -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tfx 1.2.0 requires google-cloud-aiplatform<0.8,>=0.5.0, but you have google-cloud-aiplatform 1.4.2 which is incompatible.
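
The conflict messages above follow a regular pattern, so they can be summarized programmatically when deciding which of these installs are obsolete. This is a rough sketch, assuming the message format shown in the output above. The helper `parse_conflicts` and its regular expression are hypothetical and not part of pip:

```python
import re

# Matches lines such as:
#   tfx 1.2.0 requires google-cloud-aiplatform<0.8,>=0.5.0, but you have
#   google-cloud-aiplatform 1.4.2 which is incompatible.
CONFLICT = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) requires (?P<requirement>.+?), "
    r"but you have (?P<installed>\S+) (?P<installed_version>\S+) "
    r"which is incompatible"
)

def parse_conflicts(pip_output):
    """Return one dict per dependency-conflict line found in pip's output."""
    conflicts = []
    for line in pip_output.splitlines():
        match = CONFLICT.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts

example = (
    "tfx 1.2.0 requires google-cloud-aiplatform<0.8,>=0.5.0, "
    "but you have google-cloud-aiplatform 1.4.2 which is incompatible."
)
for conflict in parse_conflicts(example):
    print(conflict["package"], "needs", conflict["requirement"],
          "but found", conflict["installed_version"])
```

Here the conflict shows that `tfx` 1.2.0 pins an older `google-cloud-aiplatform` than the one the other notebooks use, which is consistent with treating this section as obsolete.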
