In [None]:
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI Feature Store with BigQuery

---
---
## 1. Environment Setup
---
### 1.1 Set up your local development environment

**If you are using Colab or Google Cloud Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* The Google Cloud SDK
* Git
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip install jupyter` on the
command-line in a terminal shell.

1. To launch Jupyter, run `jupyter notebook` on the command-line in a terminal shell.

1. Open this notebook in the Jupyter Notebook Dashboard.

---
### 1.2 Install additional packages

In [11]:
import sys

if "google.colab" in sys.modules:
    USER_FLAG = ""
else:
    USER_FLAG = "--user"

If you haven't installed the latest packages, uncomment and run the following cell.

In [None]:
#!pip3 install {USER_FLAG} google-cloud-aiplatform --upgrade
#!pip3 install {USER_FLAG} kfp google-cloud-pipeline-components --upgrade
#!pip3 install explainable-ai-sdk

---
### 1.3 Restart the kernel

**Only if** you installed additional packages in the previous step, restart the notebook kernel so that it can find them.

In [1]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

Check the versions of the packages you installed.  The KFP SDK version should be >=1.6.

In [3]:
!python3 -c "import kfp; print('KFP SDK version: {}'.format(kfp.__version__))"

KFP SDK version: 1.8.12
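The version requirement can also be checked in code. The following is a minimal sketch, not part of the KFP SDK: `meets_minimum` is a hypothetical helper, and it assumes a plain dotted version string such as the one printed above.

```python
def meets_minimum(version: str, minimum: tuple = (1, 6)) -> bool:
    """Return True if a dotted version string is at least `minimum`.

    Assumes the leading components are plain integers (e.g. "1.8.12").
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at a non-numeric component such as "0rc1"
        parts.append(int(piece))
    return tuple(parts) >= minimum


# "1.8.12" is the version reported by the cell above.
print(meets_minimum("1.8.12"))
```

In a notebook you could pass `kfp.__version__` instead of the literal string.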


### 1.4 Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI, Cloud Storage, and Compute Engine APIs](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component,storage-component.googleapis.com). 

1. Follow the "**Configuring your project**" instructions from the Vertex Pipelines documentation.

1. If you are running this notebook locally, you will need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

#### Set your project ID

**If you don't know your project ID**, run the next cell to try to retrieve it with `gcloud`.

In [1]:
import os

PROJECT_ID = ""

# Get your Google Cloud project ID from gcloud
if not os.getenv("IS_TESTING"):
    shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID: ", PROJECT_ID)

Project ID:  erwinh-demo-joonix


Otherwise, set your project ID here.

In [3]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "erwinh-experimental"  # @param {type:"string"} update with your project id. 

#### Timestamp

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users, create a timestamp for each session and append it to the names of the resources you create in this tutorial.

In [4]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
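As a small illustration of how the timestamp keeps resource names apart, the sketch below appends it to a prefix. `unique_name` is a hypothetical helper introduced only for this example. Because the timestamp has one-second resolution, two sessions started within the same second would still collide.

```python
from datetime import datetime


def unique_name(prefix: str, timestamp: str) -> str:
    """Append a session timestamp to a resource name prefix."""
    return f"{prefix}_{timestamp}"


# Same format as the TIMESTAMP cell: year, month, day, hour, minute, second.
TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
print(unique_name("telecom_churn", TIMESTAMP))
```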

---
### 1.5 Authenticate your Google Cloud account

**If you are using Google Cloud Notebooks**, your environment is already
authenticated. Skip this step.

**If you are using Colab**, run the cell below and follow the instructions
when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

1. In the Cloud Console, go to the [**Create service account key**
   page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. Click **Create service account**.

3. In the **Service account name** field, enter a name, and
   click **Create**.

4. In the **Grant this service account access to project** section, click the **Role** drop-down list. Type "Vertex AI"
into the filter box, and select
   **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

5. Click *Create*. A JSON file that contains your key downloads to your
local environment.

6. Enter the path to your service account key as the
`GOOGLE_APPLICATION_CREDENTIALS` variable in the cell below and run the cell.

In [13]:
#import os
#import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# If on Google Cloud Notebooks, then don't execute this code
#if not os.path.exists("/opt/deeplearning/metadata/env_version"):
#    if "google.colab" in sys.modules:
#        from google.colab import auth as google_auth
#
#        google_auth.authenticate_user()
#
#    # If you are running this notebook locally, replace the string below with the
#    # path to your service account key and run this cell to authenticate your GCP
#    # account.
#    elif not os.getenv("IS_TESTING"):
#        %env GOOGLE_APPLICATION_CREDENTIALS ''

---
### 1.6 Create a Cloud Storage bucket as necessary

You will need a Cloud Storage bucket for this example.  If you don't have one that you want to use, you can make one now.


Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets.

In [7]:
BUCKET_NAME = "gs://erwinh-demo-joonix/features" 
REGION = "us-central1"  
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "gs://[your-bucket-name]":
    BUCKET_NAME = "gs://" + PROJECT_ID + "aip-" + TIMESTAMP

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [8]:
#! gsutil mb -l $REGION $BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [9]:
! gsutil ls -al $BUCKET_NAME

         0  2022-08-15T08:16:16Z  gs://erwinh-demo-joonix/features/#1660551376326851  metageneration=1
TOTAL: 1 objects, 0 bytes (0 B)


### 1.7 Import libraries and define constants

In [10]:
PATH=%env PATH
%env PATH={PATH}:/home/jupyter/.local/bin

USER = "erwinh"  # <---CHANGE THIS 
PIPELINE_ROOT = "{}/pipeline_root/{}".format(BUCKET_NAME, USER)

WORKING_DIR = f"{PIPELINE_ROOT}/{TIMESTAMP}"

MODEL_DISPLAY_NAME = f"train_deploy{TIMESTAMP}"
print(WORKING_DIR, MODEL_DISPLAY_NAME)

env: PATH=/opt/conda/bin:/opt/conda/condabin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/jupyter/.local/bin
gs://erwinh-demo-joonix/features/pipeline_root/erwinh/20220815081456 train_deploy20220815081456


In [11]:
import uuid
import numpy as np
import logging
import os
import tensorflow as tf
import pandas as pd
import kfp
import google.auth
import explainable_ai_sdk

from google.cloud import aiplatform
from google.cloud import bigquery
from google.cloud import bigquery_storage
from google.cloud.aiplatform import datasets

from google_cloud_pipeline_components import aiplatform as gcc_aip

from google.cloud.aiplatform_v1beta1 import (
    FeaturestoreOnlineServingServiceClient, FeaturestoreServiceClient)
from google.cloud.aiplatform_v1beta1.types import FeatureSelector, IdMatcher
from google.cloud.aiplatform_v1beta1.types import \
    entity_type as entity_type_pb2
from google.cloud.aiplatform_v1beta1.types import feature as feature_pb2
from google.cloud.aiplatform_v1beta1.types import \
    featurestore as featurestore_pb2
from google.cloud.aiplatform_v1beta1.types import \
    featurestore_monitoring as featurestore_monitoring_pb2
from google.cloud.aiplatform_v1beta1.types import \
    featurestore_online_service as featurestore_online_service_pb2
from google.cloud.aiplatform_v1beta1.types import \
    featurestore_service as featurestore_service_pb2
from google.cloud.aiplatform_v1beta1.types import io as io_pb2
from google.protobuf.duration_pb2 import Duration

from kfp.v2 import compiler
from kfp.v2.dsl import component
from kfp.v2.google import experimental
from kfp.v2.google.client import AIPlatformClient
from kfp.v2.dsl import ClassificationMetrics, Metrics, Output, Input, component, Model, Dataset, Artifact, Condition

from explainable_ai_sdk.metadata.tf.v2 import SavedModelMetadataBuilder

2022-08-15 08:16:52.462861: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-08-15 08:16:52.462931: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


In [12]:
API_ENDPOINT = "us-central1-aiplatform.googleapis.com"

---
---
## 2. Feature Store Creation
---
### 2.1 Prepare training dataset
1. Download the sample dataset from [Kaggle](https://www.kaggle.com/ashishkumarsingh123/telecom-churn-dataset).
2. [Create a new dataset](https://cloud.google.com/bigquery/docs/samples/bigquery-load-table-gcs-csv) in BigQuery and load the downloaded data into it as a new table, for example `bq://sample-project.ml_sample.telecom_churn`.

---
### 2.2 Create Feature store

Now it's time to create a [feature store](https://cloud.google.com/vertex-ai/docs/featurestore). The method to create a featurestore returns a long-running operation (LRO). An LRO starts an asynchronous job. LROs are returned for other API methods too, such as updating or deleting a featurestore. Calling create_lro.result() waits for the LRO to complete.
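The start-then-wait pattern of an LRO can be illustrated with Python's standard library. This is only an analogy and does not call Vertex AI: `create_resource` is a hypothetical stand-in for a slow server-side job, and the `Future` returned by `submit()` plays the role of the operation object.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_resource(name: str) -> dict:
    """Stand-in for a slow server-side job (hypothetical, not a Vertex AI call)."""
    time.sleep(0.1)
    return {"name": name, "state": "STABLE"}


with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns a handle immediately, much like an LRO object.
    operation = pool.submit(create_resource, "telecom_churn_demo")
    # result() blocks until the job finishes, then returns its value.
    resource = operation.result()

print(resource["state"])
```

The point of the pattern is that the handle is returned right away, so you decide when to block on the result.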

In [13]:
# Create admin_client for CRUD and data_client for reading feature values.
admin_client = FeaturestoreServiceClient(client_options={"api_endpoint": API_ENDPOINT})
data_client = FeaturestoreOnlineServingServiceClient(
    client_options={"api_endpoint": API_ENDPOINT} )

In [14]:
# Represents featurestore resource path.
BASE_RESOURCE_PATH = admin_client.common_location_path(PROJECT_ID, REGION)
FEATURESTORE_ID_TO_CREATE = "telecom_churn_{timestamp}".format(timestamp=TIMESTAMP)

create_lro = admin_client.create_featurestore(
    featurestore_service_pb2.CreateFeaturestoreRequest(
        parent=BASE_RESOURCE_PATH,
        featurestore_id=FEATURESTORE_ID_TO_CREATE,
        featurestore=featurestore_pb2.Featurestore(
            #display_name="Featurestore for telco churn prediction",
            online_serving_config=featurestore_pb2.Featurestore.OnlineServingConfig(
                fixed_node_count=1 # we do have the option to auto-scale. 
            ),
        ),
    )
)
# Wait for LRO to finish and get the LRO result.
print(create_lro.result())

name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456"



You can use GetFeaturestore or ListFeaturestores to check if the Featurestore was successfully created. The following example gets the details of the Featurestore. You can also navigate to the [Google Cloud Console](https://console.cloud.google.com/vertex-ai/features) to see if the feature store was created successfully.

In [17]:
admin_client.get_featurestore(
    name=admin_client.featurestore_path(PROJECT_ID, REGION, FEATURESTORE_ID_TO_CREATE)
)

name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456"
create_time {
  seconds: 1660551437
  nanos: 536694000
}
update_time {
  seconds: 1660551437
  nanos: 609398000
}
etag: "AMEw9yNyjT2BvcMbrIgviTtYMkan7D-8C8YLG964v640LQmPwK5figr2DGEooZm62DdV"
online_serving_config {
  fixed_node_count: 1
}
state: STABLE

---
### 2.3 Create Entity Type
You can specify a monitoring config which will by default be inherited by all Features under this EntityType.

In [18]:
# Create users entity type with monitoring enabled.
# All Features belonging to this EntityType will by default inherit the monitoring config.
users_entity_type_lro = admin_client.create_entity_type(
    featurestore_service_pb2.CreateEntityTypeRequest(
        parent=admin_client.featurestore_path(PROJECT_ID, REGION, FEATURESTORE_ID_TO_CREATE),
        entity_type_id="users",
        entity_type=entity_type_pb2.EntityType(
            description="Users entity",
            monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                    monitoring_interval=Duration(seconds=86400),  # 1 day
                ),
            ),
        ),
    )
)

# Similarly, wait for EntityType creation operation.
print(users_entity_type_lro.result())

name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users"



In [None]:
# Create features for the 'users' entity.
admin_client.batch_create_features(
    parent=admin_client.entity_type_path(PROJECT_ID, REGION, FEATURESTORE_ID_TO_CREATE, "users"),
    requests=[
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.INT64,
                description="mobile_number",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="mobile_number",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="average revenue per user on first month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="arpu_m1",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="average revenue per user on second month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="arpu_m2",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="average revenue per user on third month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="arpu_m3",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="average revenue per user on fourth month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="arpu_m4",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="Minutes of usage - voice calls on first month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="mou_m1",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="Minutes of usage - voice calls on second month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="mou_m2",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.DOUBLE,
                description="Minutes of usage - voice calls on third month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="mou_m3",
        ),
        featurestore_service_pb2.CreateFeatureRequest(
            feature=feature_pb2.Feature(
                value_type=feature_pb2.Feature.ValueType.BOOL,
                description="whether the user churned in the fourth month, judged by spend <= 0 in the fourth month",
                monitoring_config=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig(
                    snapshot_analysis=featurestore_monitoring_pb2.FeaturestoreMonitoringConfig.SnapshotAnalysis(
                        disabled=False,
                    ),
                ),
            ),
            feature_id="is_churn",
        ),
    ],
).result()

features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/mobile_number"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/arpu_m1"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/arpu_m2"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/arpu_m3"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/arpu_m4"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/features/mou_m1"
}
features {
  name: "projects/429963084013/locations/us-central1/featurestores/telecom_churn_20220815081456/entityTypes/users/

### 2.3.1 Search created features
While the [ListFeatures](https://cloud.google.com/vertex-ai/docs/reference/rpc/google.cloud.aiplatform.v1beta1#google.cloud.aiplatform.v1beta1.FeaturestoreService.ListFeatures) method allows you to easily view all features of a single entity type, the [SearchFeatures](https://cloud.google.com/vertex-ai/docs/reference/rpc/google.cloud.aiplatform.v1beta1#google.cloud.aiplatform.v1beta1.FeaturestoreService.SearchFeatures) method searches across all featurestores and entity types in a given location (such as us-central1). This can help you discover features that were created by someone else.

You can query based on feature properties including feature ID, entity type ID, and feature description. You can also limit results by filtering on a specific featurestore, feature value type, and/or labels.

In [16]:
# Filter on feature value type and keywords.
list(
    admin_client.search_features(
        featurestore_service_pb2.SearchFeaturesRequest(
            location=BASE_RESOURCE_PATH, query="feature_id:is_ AND value_type=STRING"
        )
    )
)

[]
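The empty result is probably expected here: the only feature whose ID contains `is_` is `is_churn`, which was created with value type `BOOL`, not `STRING`. The sketch below shows one way to assemble such query strings. `build_search_query` is a hypothetical helper, and it covers only the two filter forms used in the query above.

```python
def build_search_query(id_substring: str = "", value_type: str = "") -> str:
    """Join optional filters with AND, in the style of the query above."""
    clauses = []
    if id_substring:
        # ":" matches feature IDs containing the given substring.
        clauses.append(f"feature_id:{id_substring}")
    if value_type:
        # "=" requires an exact match on the value type.
        clauses.append(f"value_type={value_type}")
    return " AND ".join(clauses)


print(build_search_query("is_", "BOOL"))
```

The resulting string can be passed as the `query` argument of `SearchFeaturesRequest`.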

---
### 2.4 Select the data that we want to ingest into our feature store
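The query below selects the columns to ingest from the raw table, derives the `is_churn` label from `arpu_9`, adds an `update_time` column to serve as the feature timestamp, and writes the result to the destination table.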

In [28]:
BQ_RAW_DATA = "bq://erwinh-demo-joonix.ml_sample.telecom_churn" # --> Change this to the table that you created earlier. 
FEATURE_DESTINATION = "bq://erwinh-demo-joonix.ml_sample.import_features" # --> Change this. Set it to something like bq://<your-project>.churn.import_features

FEATURE_DESTINATION

'bq://erwinh-demo-joonix.ml_sample.import_features'

In [29]:
client = bigquery.Client(PROJECT_ID)

job_config = bigquery.QueryJobConfig(destination=FEATURE_DESTINATION.split('/')[-1])

sql = """
    SELECT cast(mobile_number as string) mobile_number,arpu_6,arpu_7,arpu_8,arpu_9<=0 as is_churn,onnet_mou_6,onnet_mou_7,onnet_mou_8,CURRENT_TIMESTAMP() as update_time
    FROM `{}`;
""".format(BQ_RAW_DATA.split('/')[-1])

query_job = client.query(sql, job_config=job_config)  # Make an API request.

query_job.result()  # Wait for the job to complete.

<google.cloud.bigquery.table.RowIterator at 0x7fe85b722590>
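The cell above strips the `bq://` scheme with `split('/')[-1]`. If you want the project, dataset, and table as separate values, with a check on the URI shape, a sketch along these lines may help. `parse_bq_uri` is a hypothetical helper, and it assumes the project ID contains no dots.

```python
def parse_bq_uri(uri: str) -> tuple:
    """Split 'bq://project.dataset.table' into (project, dataset, table)."""
    prefix = "bq://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}: {uri!r}")
    parts = uri[len(prefix):].split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected project.dataset.table: {uri!r}")
    return tuple(parts)


project, dataset, table = parse_bq_uri("bq://sample-project.ml_sample.telecom_churn")
print(project, dataset, table)
```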

---
### 2.5 Ingest data into the feature store

In [30]:
# Create admin_client for CRUD and data_client for reading feature values.
admin_client = FeaturestoreServiceClient(client_options={"api_endpoint": API_ENDPOINT})
data_client = FeaturestoreOnlineServingServiceClient(
    client_options={"api_endpoint": API_ENDPOINT}
)

BASE_RESOURCE_PATH = admin_client.common_location_path(PROJECT_ID, REGION)

In [31]:
# Build the request that imports feature values from BigQuery into the "users" entity type.
import_users_request = featurestore_service_pb2.ImportFeatureValuesRequest(
    entity_type=admin_client.entity_type_path(
        PROJECT_ID, REGION, FEATURESTORE_ID_TO_CREATE, "users"
    ),
    bigquery_source=io_pb2.BigQuerySource(
        # Source
        input_uri=FEATURE_DESTINATION
    ),
    entity_id_field="mobile_number",
    feature_specs=[
        # Features
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="arpu_m1", source_field="arpu_6"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="arpu_m2", source_field="arpu_7"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="arpu_m3", source_field="arpu_8"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="is_churn", source_field="is_churn"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="mou_m1", source_field="onnet_mou_6"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="mou_m2", source_field="onnet_mou_7"),
        featurestore_service_pb2.ImportFeatureValuesRequest.FeatureSpec(id="mou_m3", source_field="onnet_mou_8"),
    ],
    feature_time_field="update_time",
    worker_count=10,
)
ingestion_lro = admin_client.import_feature_values(import_users_request)
ingestion_lro.result()

imported_entity_count: 99999
imported_feature_value_count: 686819
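The feature-to-column mapping in the import request can also be kept in a single dictionary, which makes it easier to review. The sketch below builds plain keyword-argument dictionaries only; the name `FEATURE_TO_SOURCE` is an assumption made for this example. Each dictionary could then be unpacked into `FeatureSpec(**kwargs)`, since `id` and `source_field` are the two arguments used in the request above.

```python
# Feature ID in the featurestore -> column name in the BigQuery source table.
FEATURE_TO_SOURCE = {
    "arpu_m1": "arpu_6",
    "arpu_m2": "arpu_7",
    "arpu_m3": "arpu_8",
    "is_churn": "is_churn",
    "mou_m1": "onnet_mou_6",
    "mou_m2": "onnet_mou_7",
    "mou_m3": "onnet_mou_8",
}

# One keyword-argument dict per feature spec.
spec_kwargs = [
    {"id": feature_id, "source_field": column}
    for feature_id, column in FEATURE_TO_SOURCE.items()
]

for kwargs in spec_kwargs:
    print(kwargs)
```

Note that `arpu_m4` and `mobile_number` were created as features earlier but are not part of this import.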

In [None]:
# Search for all features across all featurestores.
list(admin_client.search_features(location=BASE_RESOURCE_PATH))

In [33]:
bqclient = bigquery.Client()
bqstorageclient = bigquery_storage.BigQueryReadClient()
query_string = """
SELECT
    mobile_number
FROM `{}`
""".format(BQ_RAW_DATA.split('/')[-1])

user_df = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(bqstorage_client=bqstorageclient)
)

X_train = user_df['mobile_number']

Update `TRAINING_DATA_TABLE` to point to a BigQuery table in your own project before you run the next cell.

In [37]:
TRAINING_DATA_TABLE = 'erwinh-demo-joonix.ml_sample.training_data' # --> Please update! 
FEATURESTORE_ID = '' # fill in or leave empty if you just created the feature store
if FEATURESTORE_ID == '':
    FEATURESTORE_ID = FEATURESTORE_ID_TO_CREATE
TRAINING_DATA_SELECTOR_LOC = BUCKET_NAME + '/dataset/query_instance_2.csv'

In [38]:
X_train.head()

0    7001625959
1    7001204172
2    7001864400
3    7001419799
4    7001654241
Name: mobile_number, dtype: int64

In [None]:
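# Use UTC so that the trailing "Z" in the timestamp is accurate.
now = datetime.utcnow()
current_time = now.strftime("%Y-%m-%dT%H:%M:%SZ")
# Read-instances table: one row per entity ID plus the time to read features at.
res = pd.DataFrame()
res['users']  = X_train
res['timestamp'] = current_time
# Writing to a gs:// path with pandas requires the gcsfs package.
res.to_csv(TRAINING_DATA_SELECTOR_LOC, index=False)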
admin_client = FeaturestoreServiceClient(client_options={"api_endpoint": API_ENDPOINT})
batch_serving_request = featurestore_service_pb2.BatchReadFeatureValuesRequest(
    # featurestore info
    featurestore=admin_client.featurestore_path(PROJECT_ID, REGION, FEATURESTORE_ID),
    # CSV listing the entity IDs and timestamps to read feature values for.
    csv_read_instances=io_pb2.CsvSource(
        gcs_source=io_pb2.GcsSource(uris=[TRAINING_DATA_SELECTOR_LOC])
    ),
    destination=featurestore_service_pb2.FeatureValueDestination(
        bigquery_destination=io_pb2.BigQueryDestination(
            # BigQuery table that receives the training data
            output_uri='bq://'+TRAINING_DATA_TABLE
        )
    ),
    entity_type_specs=[
        featurestore_service_pb2.BatchReadFeatureValuesRequest.EntityTypeSpec(
            entity_type_id="users",
            feature_selector=FeatureSelector(
                id_matcher=IdMatcher(
                    ids=[
                        # features, use "*" if you want to select all features within this entity type
                        "mou_m1",
                        "mou_m2",
                        "mou_m3",
                        "arpu_m1",
                        "arpu_m2",
                        "arpu_m3",
                        "is_churn"
                    ]
                )
            ),
        ),
    ],
)
batch_serving_lro = admin_client.batch_read_feature_values(batch_serving_request)
batch_serving_lro.result()
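The read-instances file written in the cell above has one column named after the entity type (`users`) and a `timestamp` column. The sketch below reproduces that layout with only the standard library, which can be handy for inspecting the format without pandas or Cloud Storage. `build_read_instances_csv` is a hypothetical helper, and the fixed timestamp is used only to make the output reproducible.

```python
import csv
import io
from datetime import datetime, timezone


def build_read_instances_csv(entity_ids, entity_type="users", when=None) -> str:
    """Return CSV text with one row per entity ID and a shared UTC timestamp."""
    # Normalize to UTC so that the "Z" suffix is accurate.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([entity_type, "timestamp"])
    for entity_id in entity_ids:
        writer.writerow([entity_id, stamp])
    return buffer.getvalue()


fixed = datetime(2022, 8, 15, 8, 30, 0, tzinfo=timezone.utc)
csv_text = build_read_instances_csv([7001625959, 7001204172], when=fixed)
print(csv_text)
```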

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:
- Delete the featurestore by running the cell below. Setting `force=True` also deletes the entity types and features it contains.
- Delete the Cloud Storage objects that were created, **only if you are not using the `PIPELINE_ROOT` path for any other purpose**.
- If you deployed a model: first, undeploy it from its *endpoint*, then delete the model and endpoint.

In [None]:
admin_client.delete_featurestore(
    request=featurestore_service_pb2.DeleteFeaturestoreRequest(
        name=admin_client.featurestore_path(PROJECT_ID, REGION, FEATURESTORE_ID),
        force=True,
    )
).result()

