<a href="https://colab.research.google.com/github/treasure-data/td-notebooks/blob/master/workflow/deploy_treasure_box.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deploy Treasure Box from Google Colaboratory

This notebook enables you to deploy [Treasure Boxes](https://boxes.treasuredata.com/hc/en-us) in 3 steps.

1. Input your Treasure Data API key in the first cell.
2. Modify **category**, **box_name**, **project_name**, **secrets**, and **site** in the form.
3. Run the remaining cells with "Runtime" -> "Run after", then click the link in the output of the last cell.

If you run this notebook for the first time, you'll see the message "Warning: This notebook was not authored by Google." Arm Treasure Data doesn't save your data in the notebook, except for the configuration you've set.

## Input your Treasure Data API key

You need to set your TD API master key.

To get the key, see:
https://support.treasuredata.com/hc/en-us/articles/360000763288-Get-API-Keys

After you run the following cell, an input box will appear. Paste your TD API key and press the Return key.

In [1]:
import getpass

print("Input your Treasure Data API key")
apikey = getpass.getpass()

Input your Treasure Data API key
··········


## Set configuration

You can set the target box name as well as the secrets the box needs.

For example, if you want to deploy the box 
https://github.com/treasure-data/treasure-boxes/tree/master/integration-box/pandas
you should set `category` to `"integration-box"` and `box_name` to `"pandas"`.

You also have to set a unique name for `project_name`; otherwise, you'll overwrite someone else's workflow project.

The secrets a box requires may differ from the default settings, so follow the README.md of the box on GitHub.

The last variable you have to set is `site`. Choose the appropriate value for your TD region.

If you need additional secrets, modify the `secrets` variable.
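
One way to lower the risk of a project name collision is to derive the name from the box name, an owner tag, and a short random suffix. The sketch below is only an illustration: `make_project_name` and its `owner` argument are hypothetical helpers, not part of this notebook or of tdworkflow, and the function does not check the names that already exist on Treasure Data.

```python
import re
import uuid


def make_project_name(box_name, owner):
    """Return a name such as 'pandas-aki-3f9c1a' (hypothetical helper)."""
    # Normalize the owner tag to lowercase letters, digits, and hyphens.
    owner_tag = re.sub(r"[^a-z0-9]+", "-", owner.lower()).strip("-")
    # Six random hex characters make an accidental clash unlikely, not impossible.
    suffix = uuid.uuid4().hex[:6]
    return f"{box_name}-{owner_tag}-{suffix}"


print(make_project_name("pandas", "aki"))
```

The result can be pasted into the `project_name` field of the form.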

In [0]:
#@title Configure treasure-boxes info
#@markdown You can select the category from the dropdown list
category = "integration-box" #@param ["integration-box", "analytics-box", "machine-learning-box", "data-box"]
#@markdown Box name can be `pandas` if the workflow URL is `https://github.com/treasure-data/treasure-boxes/tree/master/integration-box/pandas`.
box_name = "pandas" #@param {type: "string"}
#@markdown Or, you can paste the box URL (optional). It works only with the master branch and, if set, overrides `category` and `box_name`.
#@markdown
#@markdown E.g., `https://github.com/treasure-data/treasure-boxes/tree/master/integration-box/pandas`
box_url = "" #@param {type: "string"}
#@markdown Type a unique project name that doesn't conflict with other project names.
project_name = "pandas-aki"  #@param {type: "string"}
#@markdown Select your site.
site = "us" #@param ["us", "jp", "eu01"]

APISERVERS = {
    "us": "https://api.treasuredata.com",
    "jp": "https://api.treasuredata.co.jp",
    "eu01": "https://api.eu01.treasuredata.com"
    }

import re

if box_url:
    # Extract category and box name from a master-branch box URL.
    m = re.search(r"https://github\.com/treasure-data/treasure-boxes/tree/master/(?P<cat>[^/]+)/(?P<name>[^/]+)(/.*)?$", box_url)
    if m is None:
        raise ValueError("box_url is invalid.")
    category = m.group("cat")
    box_name = m.group("name")
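
The URL handling in the cell above can also be written as a small standalone function, which makes it easy to try against a few URLs before deploying. This is a minimal sketch: `parse_box_url` and `BOX_URL_PATTERN` are names introduced here for illustration, and the pattern assumes the URL points at the master branch, as the form describes.

```python
import re

# Dots are escaped so they match literally; each path segment stops at "/".
BOX_URL_PATTERN = re.compile(
    r"^https://github\.com/treasure-data/treasure-boxes/tree/master/"
    r"(?P<cat>[^/]+)/(?P<name>[^/]+)(?:/.*)?$"
)


def parse_box_url(url):
    """Return (category, box_name) for a box URL, or raise ValueError."""
    m = BOX_URL_PATTERN.match(url.strip())
    if m is None:
        raise ValueError(f"box_url is invalid: {url!r}")
    return m.group("cat"), m.group("name")


print(parse_box_url(
    "https://github.com/treasure-data/treasure-boxes/tree/master/integration-box/pandas"
))
```

A URL that points deeper into a box, such as a file inside it, still resolves to the box's category and name because everything after the second path segment is ignored.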

If you want to set additional secrets, modify the following cell. For sensitive values, we recommend reading them with `getpass.getpass()`, as is done for `apikey`.

In [0]:
# Write additional secrets, if needed.

# Secrets information will be stored on Treasure Data
secrets = {
    "td.apikey": apikey,
    "td.apiserver": APISERVERS[site],
    # Add extra secrets if you want like:
    # "mysecrets": "SECRET-VALUE",
}
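
As a sketch of the `getpass.getpass()` recommendation, the helper below builds the same dictionary and reads each extra secret through a hidden prompt instead of hard-coding it in the cell. `build_secrets`, `extra_names`, and the `prompt` parameter are hypothetical names introduced for this example; the `APISERVERS` mapping is copied from the configuration cell.

```python
import getpass

# Copied from the configuration cell above.
APISERVERS = {
    "us": "https://api.treasuredata.com",
    "jp": "https://api.treasuredata.co.jp",
    "eu01": "https://api.eu01.treasuredata.com",
}


def build_secrets(apikey, site, extra_names=(), prompt=getpass.getpass):
    """Build the secrets dict; each extra secret is read via `prompt`."""
    secrets = {"td.apikey": apikey, "td.apiserver": APISERVERS[site]}
    for name in extra_names:
        # getpass hides the typed value, so it never appears in the cell output.
        secrets[name] = prompt(f"Input value for {name}: ")
    return secrets


# Non-interactive demo: a stub prompt stands in for getpass.
demo = build_secrets("DUMMY-KEY", "jp", ["mysecrets"], prompt=lambda msg: "SECRET-VALUE")
print(sorted(demo))
```

In the notebook itself you would call `build_secrets(apikey, site, ["mysecrets"])` and type the value when prompted.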


## Deploy the box

The following code deploys the box with Python. It is based on [tdworkflow](https://github.com/chezou/tdworkflow) and [GitPython](https://gitpython.readthedocs.io/en/stable/index.html).

You don't need to read or understand the following code. Just run every cell through to the last one; its output contains a link to the deployed workflow.

In [9]:
#@title
!pip install -U -q tdworkflow
!pip install -q gitpython


In [10]:
#@title
import tempfile
import os
import shutil

from git import Git


tempdir = tempfile.gettempdir()

git_repo = "https://github.com/treasure-data/treasure-boxes/"

target_dir = os.path.join(tempdir, "treasure-boxes")
# shutil.rmtree(target_dir)

if not os.path.exists(target_dir):
    try:
        Git(tempdir).clone(git_repo)
        print("Clone repository succeeded")
    except Exception:
        print("Repository clone failed")
        raise

Clone repository succeeded


In [11]:
#@title
!ls {tempdir}/treasure-boxes

analytics-box	 LICENSE	       scenarios  td_load
data-box	 machine-learning-box  td	  td_run
integration-box  README.md	       td_ddl	  tool-box


In [0]:
#@title
import tdworkflow

target_box = os.path.join(category, box_name)
target_path = os.path.join(tempdir, "treasure-boxes", target_box)

client = tdworkflow.client.Client(site=site, apikey=apikey)
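
Before uploading, it can help to confirm that the selected box actually exists in the cloned repository, since a typo in `category` or `box_name` would otherwise only surface when the project is created. The check below is a minimal sketch using only the standard library; `resolve_box_path` is a hypothetical helper, not a tdworkflow function.

```python
import os
import tempfile


def resolve_box_path(repo_root, category, box_name):
    """Return the box directory, or raise FileNotFoundError if it is missing."""
    path = os.path.join(repo_root, category, box_name)
    if not os.path.isdir(path):
        raise FileNotFoundError(
            f"No box at {path}; check category={category!r} and box_name={box_name!r}"
        )
    return path


# Demo against a throwaway directory tree instead of the real clone.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "integration-box", "pandas"))
    print(os.path.basename(resolve_box_path(root, "integration-box", "pandas")))
```

In this notebook the call would be `resolve_box_path(target_dir, category, box_name)`, and its return value is the same path as `target_path`.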

In [13]:
#@title
project = client.create_project(project_name, target_path)

Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/ingest.dig as ingest.dig
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/py_scripts as py_scripts
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/py_scripts/data.py as data.py
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/data.sh as data.sh
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/config as config
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/config/params.yml as params.yml
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/config/params.test.yml as params.test.yml
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/test.dig as test.dig
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/predict_sales.dig as predict_sales.dig
Added /tmp/treasure-boxes/machine-learning-box/sales-prediction/predict_sales_simple.dig as predict_sales_simple.dig
Added /tmp/treasure-boxes/machine-learning-box/sales-pred

In [14]:
#@title
CONSOLE_URL = {
    "us": "https://console.treasuredata.com/app/workflows",
    "eu01": "https://console.eu01.treasuredata.com/app/workflows",
    "jp": "https://console.treasuredata.co.jp/app/workflows",
}

workflows = client.project_workflows(project)
workflows = list(filter(lambda w: w.name != "test", workflows))
if workflows:
    print(f"Project created! Open {CONSOLE_URL[site]}/{workflows[0].id}/info in your browser and click the 'New Run' button.")
else:
    print("Project creation failed.")

Project created! Open https://console.treasuredata.com/app/workflows/1746220/info in your browser and click the 'New Run' button.
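
The link printed above is the console base URL for the site, followed by the workflow ID and `/info`. The sketch below isolates that step so an unknown `site` value produces a clear error. `workflow_info_url` is a hypothetical helper, and the `CONSOLE_URL` mapping is copied from the cell above.

```python
# Copied from the last cell of the notebook.
CONSOLE_URL = {
    "us": "https://console.treasuredata.com/app/workflows",
    "eu01": "https://console.eu01.treasuredata.com/app/workflows",
    "jp": "https://console.treasuredata.co.jp/app/workflows",
}


def workflow_info_url(site, workflow_id):
    """Build the console link for a workflow; reject unknown sites."""
    if site not in CONSOLE_URL:
        raise ValueError(f"Unknown site {site!r}; expected one of {sorted(CONSOLE_URL)}")
    return f"{CONSOLE_URL[site]}/{workflow_id}/info"


print(workflow_info_url("us", 1746220))
# prints https://console.treasuredata.com/app/workflows/1746220/info
```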
