### Introduction

This notebook is used for developing the IBM Analytics Engine Library. I use [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) for development.

After changing any of the library code, restart the kernel of this notebook to pick up the changes.
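
For small edits, reloading a single module can sometimes stand in for a full kernel restart. The sketch below uses the standard library's `importlib.reload`; the helper name `reload_module` is my own and not part of the library. Objects created before the reload, and names bound with `from module import name`, still point at the old code, so restarting the kernel remains the reliable option.

```python
import importlib
import sys


def reload_module(module_name):
    """Import the named module, or re-execute it if it is already loaded.

    Hypothetical helper for illustration only.
    """
    # Make sure the import system notices files created or changed on disk.
    importlib.invalidate_caches()
    module = sys.modules.get(module_name)
    if module is None:
        return importlib.import_module(module_name)
    return importlib.reload(module)


# Example (only the named module is reloaded, not its submodules):
# reload_module('ibm_analytics_engine.client')
```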


### Example Code

#### Setup for developing the IBM Analytics Engine Library (e.g. on local machine)

Set up the Python path to work directly with the IBM Analytics Engine Library source. This step is only required when developing locally.

In [1]:
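import os, sys

# Put the current working directory first on sys.path so the local
# library source is imported ahead of any installed copy.
cwd = os.getcwd()
if sys.path[0] != cwd:
    sys.path.insert(0, cwd)
# print(sys.path)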

#### Setup for using the IBM Analytics Engine Library (e.g. from Watson Studio)

In [2]:
# !pip install --upgrade --quiet git+https://github.com/snowch/ibm-analytics-engine-python@master

#### Using the IBM Analytics Engine Library

Create a new resource group API client.

First create an [API Key](https://console.bluemix.net/docs/iam/apikeys.html#platform-api-keys) and save it somewhere safe.

In [3]:
from ibm_analytics_engine.client import AnalyticsEngine
import os

home_directory = os.environ['HOME']

# If you are using this from Watson Studio, you could alternatively use the `api_key` parameter
client = AnalyticsEngine(
    api_key_filename = '{}/.ibmcloud/apiKey.json'.format(home_directory)
    )
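
A missing key file is easier to diagnose before the client is created. The sketch below builds the same `~/.ibmcloud/apiKey.json` path with `os.path.expanduser`, which also works where `HOME` is not set, and fails early if the file is absent. Both helper names are hypothetical and not part of the library.

```python
import os


def default_api_key_path(home=None):
    """Return the API key path used in this notebook, under the given home directory."""
    home = home or os.path.expanduser('~')
    return os.path.join(home, '.ibmcloud', 'apiKey.json')


def require_file(path):
    """Return path unchanged, or raise if no file exists there."""
    if not os.path.isfile(path):
        raise FileNotFoundError('API key file not found: {}'.format(path))
    return path
```

The result could then be passed on, e.g. `AnalyticsEngine(api_key_filename=require_file(default_api_key_path()))`.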

List the available resource groups

In [4]:
[ (rg['id'], rg['name']) for rg in client.get_resource_groups()['resources'] ]

[('16c478f93fbe44b2b74a5e29a4a20121', 'ibm-cloud-streaming-retail-demo'),
 ('3b90ff40e6cc43978189146817edc403', 'default'),
 ('778882adaf4a4fb0831830ccd39fdde2', 'CSnow Watson Data Demo'),
 ('c7a22cded9f64d7d88c25be8ae340297', 'Analyst Insights London 2018')]

Select the resource group to use. Its ID is required in the provisioning data

In [18]:
resource_group_name = 'default'

# Look up the ID of the resource group with the chosen name
resource_group = [ rg['id'] for rg in client.get_resource_groups()['resources'] if rg['name'] == resource_group_name ][0]
print(resource_group)

3b90ff40e6cc43978189146817edc403
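
The list comprehension above raises a bare `IndexError` if the name does not match any group. A minimal sketch of the same lookup with a clearer error is shown here; it assumes only the response shape seen above (a `resources` list of dicts with `id` and `name`), and `find_resource_group_id` is a hypothetical helper, not part of the library.

```python
def find_resource_group_id(resource_groups_response, name):
    """Return the ID of the first resource group with the given name."""
    resources = resource_groups_response['resources']
    for rg in resources:
        if rg['name'] == name:
            return rg['id']
    known = sorted(rg['name'] for rg in resources)
    raise ValueError('No resource group named {!r}; found: {}'.format(name, known))


# Sample data shaped like the output of client.get_resource_groups() above
sample = {'resources': [
    {'id': '16c478f93fbe44b2b74a5e29a4a20121', 'name': 'ibm-cloud-streaming-retail-demo'},
    {'id': '3b90ff40e6cc43978189146817edc403', 'name': 'default'},
]}
print(find_resource_group_id(sample, 'default'))
```

With a live client this would be called as `find_resource_group_id(client.get_resource_groups(), resource_group_name)`.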


Define cluster provisioning configuration data

In [6]:
# uncomment the variables below to set COS S3
#
#servicename = 'myservicename'
#cos_s3_userkey = 'changeme'
#cos_s3_endpoint = 'changeme'
#cos_s3_secretkey = 'changeme'

data = {
    "name": "MyAnalyticEngineInstance",
    "resource_plan_id": "3175a5cf-61e3-4e79-aa2a-dff9a4e1f0ae",
    "resource_group_id": resource_group,
    "region_id": "us-south",
    "parameters": {
        "hardware_config": "default",
        "num_compute_nodes": "1",
        "software_package": "ae-1.1-spark",
        # uncomment the block below to set COS S3
        #
        #"advanced_options": {
        #    "ambari_config": {
        #        "core-site": {
        #            "fs.cos.{}.access.key".format(servicename):  cos_s3_userkey,
        #            "fs.cos.{}.endpoint".format(servicename):    cos_s3_endpoint,
        #            "fs.cos.{}.secret.key".format(servicename):  cos_s3_secretkey
        #        }
        #    }
        #}
    }
}
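
Instead of uncommenting the COS S3 block by hand, the same keys can be merged into the provisioning data programmatically. This is a sketch only: the key names are copied from the commented block above, while `cos_core_site` and `with_cos_config` are hypothetical helpers of my own.

```python
import copy


def cos_core_site(servicename, access_key, endpoint, secret_key):
    """Build the core-site entries for one COS S3 service."""
    return {
        "fs.cos.{}.access.key".format(servicename): access_key,
        "fs.cos.{}.endpoint".format(servicename): endpoint,
        "fs.cos.{}.secret.key".format(servicename): secret_key,
    }


def with_cos_config(provisioning_data, core_site):
    """Return a copy of the provisioning data with the COS settings added."""
    result = copy.deepcopy(provisioning_data)  # leave the caller's dict untouched
    result["parameters"]["advanced_options"] = {
        "ambari_config": {"core-site": core_site}
    }
    return result
```

For example, `data = with_cos_config(data, cos_core_site('myservicename', 'changeme', 'changeme', 'changeme'))` would produce the same structure as the commented-out block.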

Provision the cluster. This call returns immediately, but the cluster can take around 30 minutes to spin up

In [7]:
create_cluster_response = client.create(data)

# uncomment below to see the provisioning REST response

# print(create_cluster_response)

Get the cluster instance ID from the provisioning response

In [None]:
# some backend APIs require different types of IDs
id = create_cluster_response['id']
instance_id = create_cluster_response['guid']

# print(id)
# print(instance_id)

Check the current status

In [9]:
client.cluster_status(instance_id) 

{'state': 'Preparing'}

This call blocks until provisioning has finished, either successfully or unsuccessfully

In [10]:
client.cluster_status(instance_id, wait_until_finished_preparing=True)

{'state': 'Active'}
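
If a timeout or a custom polling interval is needed, the non-blocking status call can be wrapped in a loop. The sketch below is not how the library implements `wait_until_finished_preparing`; it assumes only what the outputs above show, namely that the state is `'Preparing'` while provisioning is in progress. The function name and its parameters are hypothetical.

```python
import time


def wait_until_prepared(get_state, poll_seconds=30, timeout_seconds=3600,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns something other than 'Preparing'.

    Returns the final state, or raises TimeoutError once the deadline passes.
    The sleep and clock arguments exist so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        state = get_state()
        if state != 'Preparing':
            return state
        if clock() >= deadline:
            raise TimeoutError('Cluster still preparing after {} seconds'.format(timeout_seconds))
        sleep(poll_seconds)
```

With the client from this notebook it could be called as `wait_until_prepared(lambda: client.cluster_status(instance_id)['state'])`.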

Delete the cluster when it is no longer needed

In [14]:
client.delete(id)

<Response [204]>