In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vision Workshop - Environment Setup

## Overview

[Vision Workshop](https://github.com/mblanc/vision-workshop) is a series of labs on how to build an image classification system on Google Cloud. Throughout the Vision Workshop labs, you will learn how to read image data stored in a data lake, perform exploratory data analysis (EDA), train a model, register it in a model registry, evaluate it, deploy it to an endpoint, and run real-time inference against it.

### Objective

Before you run this notebook, make sure that you have completed the steps in [README](README.md).

In this notebook, you will set up your environment for the Vision Workshop so that it can be used in the subsequent labs.

This lab uses the following Google Cloud services and resources:

- [Vertex AI](https://cloud.google.com/vertex-ai/)
- [Google Cloud Storage](https://cloud.google.com/storage)

Steps performed in this notebook:

- Set up your environment.
- Load image data into Cloud Storage.
- Read data from Cloud Storage.

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage

To estimate costs for your projected usage, see [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing).

### Install additional packages

Install the following packages required to execute this notebook.

In [1]:
import os

# The Vertex AI Workbench Notebook product has specific requirements
IS_WORKBENCH_NOTEBOOK = os.getenv("DL_ANACONDA_HOME")
IS_USER_MANAGED_WORKBENCH_NOTEBOOK = os.path.exists(
    "/opt/deeplearning/metadata/env_version"
)

# Vertex AI Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_WORKBENCH_NOTEBOOK:
    USER_FLAG = "--user"

!pip install --upgrade --no-warn-conflicts {USER_FLAG} -q \
    google-cloud-pubsub==2.13.6 \
    google-api-core==2.8.2 \
    google-apitools==0.5.32 \
    plotly==5.10.0 \
    itables==1.2.0 \
    apache_beam==2.40.0 \
    google-cloud-pipeline-components \
    kfp \
    tensorflow==2.8.3 \
    tensorflow_datasets \
    tensorflow_hub


After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
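
Once the kernel has restarted, you may want to confirm that the packages are visible to it. The following is a minimal sketch rather than part of the original lab: the helper name `installed_versions` is hypothetical, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not visible to this kernel.
            versions[name] = None
    return versions


# Package names taken from the install cell above.
print(installed_versions(["tensorflow", "plotly", "itables"]))
```

A `None` value suggests that the install did not complete or that the kernel was not restarted.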

### Set up your environment

Run the next cell to generate a short random ID and to set your project ID, bucket name, and region. These constants are used in the rest of the lab.

In [None]:
import random
import string

# Generate unique ID to help w/ unique naming of certain pieces
ID = "".join(random.choices(string.ascii_lowercase + string.digits, k=5))

GCP_PROJECTS = !gcloud config get-value project
PROJECT_ID = GCP_PROJECTS[0]
BUCKET_NAME = f"{PROJECT_ID}-vision-workshop"
REGION = "europe-west4"
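
Because `BUCKET_NAME` is derived from the project ID, it can be worth checking that the result looks like a valid bucket name before creating the bucket. The sketch below is an optional addition with a hypothetical helper name. It checks only a simplified subset of the Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit), not the full set.

```python
import re

# Simplified subset of the Cloud Storage bucket naming rules (an assumption
# for illustration; the service enforces additional restrictions).
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes the simplified checks above."""
    return _BUCKET_RE.fullmatch(name) is not None


# Hypothetical project IDs, used only to exercise the check.
print(looks_like_valid_bucket_name("my-project-vision-workshop"))
print(looks_like_valid_bucket_name("example.com:my-project-vision-workshop"))
```

A domain-scoped project ID contains a colon, so the second call fails the check. In that case you would need to choose a different bucket name by hand.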

### Create a Google Cloud Storage bucket and save the config data

Next, we will create a Google Cloud Storage bucket and save the config data in it. After the cell finishes, you can navigate to [Google Cloud Storage](https://console.cloud.google.com/storage/) to see the bucket.

In [None]:
config = f"""
BUCKET_NAME          = \"{BUCKET_NAME}\"
PROJECT              = \"{PROJECT_ID}\"
REGION               = \"{REGION}\"
ID                   = \"{ID}\"
MODEL_NAME           = \"vision_workshop_model\"
ENDPOINT_NAME        = \"vision_workshop_endpoint\"
"""

!gsutil mb -l {REGION} gs://{BUCKET_NAME}

!echo '{config}' | gsutil cp - gs://{BUCKET_NAME}/config/notebook_env.py
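
The saved file consists of plain Python assignments, so a later notebook could read it back by executing the text into a dictionary. The sketch below shows one possible approach and is not necessarily how the later labs do it. The helper `load_config` and the sample values are hypothetical. In a notebook, the text would come from the bucket, for example via `gsutil cat`.

```python
# Hypothetical sample of what notebook_env.py contains after the cell above.
config_text = '''
BUCKET_NAME          = "my-project-vision-workshop"
PROJECT              = "my-project"
REGION               = "europe-west4"
MODEL_NAME           = "vision_workshop_model"
'''


def load_config(text):
    """Execute simple NAME = "value" assignments and return them as a dict."""
    namespace = {}
    # Only do this with config files you wrote yourself: exec runs arbitrary code.
    exec(text, {}, namespace)
    return namespace


settings = load_config(config_text)
print(settings["REGION"], settings["MODEL_NAME"])
```

Loading into a dictionary instead of the global namespace keeps the notebook's own variables from being overwritten by accident.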

### Copy the data into Google Cloud Storage

Now we will download the flowers sample data from the public `cloud-samples-data` bucket, unzip it locally, and upload it to the Cloud Storage bucket created above.

In [None]:
!gsutil -m cp -r gs://cloud-samples-data/ai-platform/flowers/flowers_200_folders.zip .


In [18]:
!gsutil -m ls gs://cloud-samples-data/ai-platform/cifar_custom

gs://cloud-samples-data/ai-platform/cifar_custom/hpt_cifar.tar.gz
gs://cloud-samples-data/ai-platform/cifar_custom/tf2_trainer_cifar.tar.gz
gs://cloud-samples-data/ai-platform/cifar_custom/trainer_cifar.tar.gz
gs://cloud-samples-data/ai-platform/cifar_custom/tuner_cifar.tar.gz


In [16]:
!gsutil -m cp gs://cloud-samples-data/ai-platform/flowers/flowers.csv .

Copying gs://cloud-samples-data/ai-platform/flowers/flowers.csv...
/ [1/1 files][314.3 KiB/314.3 KiB] 100% Done                                    
Operation completed over 1 objects/314.3 KiB.                                    


In [10]:
!rm -rf prod

In [11]:
!mkdir prod

In [12]:
!unzip flowers_200_unlabeled.zip -d prod

Archive:  flowers_200_unlabeled.zip
   creating: prod/flowers_200_unlabeled/
  inflating: prod/flowers_200_unlabeled/10090824183_d02c613f10_m.jpg  
  inflating: prod/flowers_200_unlabeled/1031799732_e7f4008c03.jpg  
  inflating: prod/flowers_200_unlabeled/10466558316_a7198b87e2.jpg  
  inflating: prod/flowers_200_unlabeled/10555749515_13a12a026e.jpg  
  inflating: prod/flowers_200_unlabeled/11023277956_8980d53169_m.jpg  
  inflating: prod/flowers_200_unlabeled/110472418_87b6a3aa98_m.jpg  
  inflating: prod/flowers_200_unlabeled/1193386857_3ae53574f2_m.jpg  
  inflating: prod/flowers_200_unlabeled/12025038686_7f10811d4b_n.jpg  
  inflating: prod/flowers_200_unlabeled/12193032636_b50ae7db35_n.jpg  
  inflating: prod/flowers_200_unlabeled/12338444334_72fcc2fc58_m.jpg  
  inflating: prod/flowers_200_unlabeled/1244774242_25a20d99a9.jpg  
  inflating: prod/flowers_200_unlabeled/12916135413_dafcf3089e_n.jpg  
  inflating: prod/flowers_200_unlabeled/12998979765_3de89e7195_n.jpg  
  inflating: 

In [15]:
!ls -lrth prod/* | wc -l

201
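
When `ls -l` is given a single directory, its output begins with a `total` line, so the count of 201 above likely corresponds to 200 image files. A more direct count can be sketched with `pathlib`. The helper name and the suffix list are assumptions for illustration.

```python
from pathlib import Path


def count_images(root, suffixes=(".jpg", ".jpeg", ".png")):
    """Count files under root (recursively) whose extension looks like an image."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )


# Only run the count if the directory from the unzip step exists.
if Path("prod").is_dir():
    print(count_images("prod"))
```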


In [None]:
!mkdir flowers

In [None]:
!unzip flowers_200_folders.zip -d flowers
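
If the `unzip` command is not available in your environment, the same extraction can be sketched with the standard-library `zipfile` module. The helper `extract_archive` is hypothetical and is shown only as an alternative to the shell commands above.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest):
    """Extract a zip archive into dest and return the number of files extracted."""
    Path(dest).mkdir(parents=True, exist_ok=True)  # same role as `mkdir`
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Directory entries are skipped so that only files are counted.
        return sum(1 for info in archive.infolist() if not info.is_dir())


# Only extract if the archive downloaded earlier is present.
if Path("flowers_200_folders.zip").exists():
    print(extract_archive("flowers_200_folders.zip", "flowers"))
```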

In [None]:
!gsutil -m cp -r flowers gs://{BUCKET_NAME}/

In [None]:
BUCKET_NAME

### Check data in Google Cloud Storage

After ingesting our data into GCS, list the contents of the bucket recursively to verify that the config file and the image folders were uploaded.

In [None]:
!gsutil ls -R gs://{BUCKET_NAME}/
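
A recursive listing of a few hundred images is long, so a per-folder summary can be easier to read. The sketch below assumes that the listing was captured as a list of lines (in IPython, `listing = !gsutil ls -R gs://{BUCKET_NAME}/`). The helper name and the sample object paths are hypothetical.

```python
from collections import Counter


def count_objects_by_folder(listing_lines):
    """Count object URLs per parent folder in `gsutil ls -R` style output."""
    counts = Counter()
    for line in listing_lines:
        line = line.strip()
        # Skip blank lines, folder headers ("gs://.../:") and folder entries.
        if not line or line.endswith(":") or line.endswith("/"):
            continue
        folder = line.rsplit("/", 1)[0]
        counts[folder] += 1
    return counts


# Hypothetical sample lines, used only to show the shape of the input.
sample = [
    "gs://my-bucket/config/:",
    "gs://my-bucket/config/notebook_env.py",
    "",
    "gs://my-bucket/flowers/daisy/:",
    "gs://my-bucket/flowers/daisy/img_1.jpg",
    "gs://my-bucket/flowers/daisy/img_2.jpg",
]
for folder, n in sorted(count_objects_by_folder(sample).items()):
    print(folder, n)
```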

### END

Now you can go to the next notebook, `01_exploratory_data_analysis.ipynb`.