H2O 3 REST API Overview

The H2O REST API allows you to access all the capabilities of H2O from an external program or script, via JSON over HTTP.

The REST API is used by the Flow UI, as well as both the R and Python bindings: everything that you can do with those clients can be done by using the REST API, including data import, model building and generating predictions.

You can call the REST API:

  • from your browser
  • using browser tools such as Postman in Chrome
  • using curl
  • using the language of your choice

Generated payload POJOs for Java are available as part of the release in a separate bindings JAR file, and are simple to generate for other languages if desired.

Reference Documentation

Reference documentation for the REST API is available in the Help sidebar in Flow, as well as on the H2O.ai website, http://docs.h2o.ai/. The reference documentation is all generated from the H2O server via the Metadata facilities described below so it is always up to date.

Versioning and Stability

Both the endpoints and the payloads for the REST API are versioned for stability; the current stable version for both is 3. Versions will be supported for some time after a new major version is released to provide time to upgrade your clients.

In general you will want to write to a specific version, such as 3, and upgrade shortly after a new major version is released. Once we release a new major version of the REST API most new features will be added only to the new version.

Non-breaking changes

We continue to add features to the APIs, but we only allow non-breaking changes in a published API such as version 3. Breaking changes force a new major version number.

A non-breaking change will not change the behavior of a well-written client. Examples of non-breaking changes are:

  • adding additional output fields to a response
  • adding a parameter with a default value that maintains the old behavior if the parameter is omitted

We test backward compatibility by running a full set of tests against each new release (including nightlies) using old releases of the Flow, R, and Python clients.

The EXPERIMENTAL version

Features that are under development and are not yet stable use version 99, which indicates that they may change between releases. Once those features become stable, we change the version from 99 to the current stable version.

For request URLs, you may use EXPERIMENTAL as the version number to make it clear in your client code that you are making requests to an experimental endpoint:

GET http://127.0.0.1:54321/EXPERIMENTAL/Sample

URLs

Your H2O cluster is typically referenced by the host name and HTTP port of the first server in the cluster. By default this is http://localhost:54321 (or https://localhost:54321, if you have an enterprise license). Append the endpoint request URI to this to form your request URL.

H2O REST API URIs begin with a version followed by a resource type, such as /3/Frames or /3/Models or /3/Cloud. Typically a GET to this kind of resource collection URI will return all the instances of the resource type.

All endpoints that deal with a resource type will begin with the same prefix. As an example, GET /3/Frames returns the list of all Frames, while GET /3/Frames/my_frame returns the Frame named my_frame.
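The URL rules above can be sketched as a small helper. This is only an illustration: `h2o_url` and its parameters are hypothetical names, not part of H2O or its bindings. It percent-encodes each path segment, which mirrors the `%24` escaping used for Job keys in the curl examples later in this document; whether every key type needs this is an assumption.

```python
from urllib.parse import quote


def h2o_url(base, version, *segments):
    """Join a cluster base URL, an API version, and path segments.

    Hypothetical helper, not an H2O function. Each segment is
    percent-encoded so that characters such as '$' stay inside
    a single path segment.
    """
    path = "/".join(quote(str(s), safe="") for s in segments)
    return f"{base.rstrip('/')}/{version}/{path}"


# Collection URI and a single named resource, as described above.
frames_url = h2o_url("http://localhost:54321", 3, "Frames")
one_frame_url = h2o_url("http://localhost:54321", 3, "Frames", "my_frame")
```

Passing `"EXPERIMENTAL"` as the version yields the experimental form of the URL shown earlier.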

HTTP Verbs

As is standard for REST APIs, the HTTP verbs GET, HEAD, POST and DELETE are used to interact with the resources in the server.

  • GET requests fetch data and do not cause side effects. All parameters for the request are contained within the URL, either within the path (e.g., /3/Frames/my_frame_name/a_column_name) or as query parameters (e.g., /3/Frames/my_frame_name?row_offset=10000&row_count=1000).

  • HEAD requests return just the HTTP status for accessing the resource.

  • POST requests create a new object within the H2O cluster. Examples are importing or parsing a file into a Frame or training a new Model. Some parameters may be given in the URL, but most are given using a request schema. The fields of the request schema are sent in the POST body using x-www-form-urlencoded format, like an HTML form. More on this below in the Formats section.

    A future version of H2O will move to using application/json.

  • DELETE requests delete an object, generally from the distributed object store.

  • PUT is used for requests that modify objects; it is not used yet.

HTTP Status Codes

H2O uses standard HTTP status codes for all its responses. Refer to Wikipedia for more information on their meanings.

The status codes currently used by H2O are:

  • 200 OK (all is well)
  • 400 Bad Request (the request URL is bad)
  • 404 Not Found (a specified object was not found)
  • 412 Precondition Failed (bad parameters or other problem handling the request)
  • 500 Internal Server Error (unanticipated failure occurred in the server)
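A client might branch on these codes roughly as follows. This is a sketch, not H2O code: the table simply restates the list above, and `classify_status` and its return labels are hypothetical.

```python
# Status codes listed above, with the names given in this document.
H2O_STATUS = {
    200: "OK",
    400: "Bad Request",
    404: "Not Found",
    412: "Precondition Failed",
    500: "Internal Server Error",
}


def classify_status(code):
    """Return 'ok', 'client_error', 'server_error', or 'unexpected'.

    'unexpected' covers any code that is not in the list above.
    """
    if code not in H2O_STATUS:
        return "unexpected"
    if code == 200:
        return "ok"
    return "client_error" if code < 500 else "server_error"
```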

Formats

The payloads for each endpoint are implemented as versioned schemas. These schemas are self-describing for simplicity and ease of implementation, especially if you persist them for later.

Schemas

Schemas specify all the relevant properties of each field of an input or response including name, type, default value, help string, direction (in, out or inout), whether or not input fields are required and how important they are to specify, allowed values for enumerated fields, and so on. Schema fields can be simple values or nested schemas, or arrays or dictionaries (maps) of these.

This example shows the model_id field returned by a model builder call:

        "parameters": [
            {
                "__meta": {
                    "schema_name": "ModelParameterSchemaV3",
                    "schema_type": "Iced",
                    "schema_version": 3
                },
                "actual_value": {
                    "URL": "/3/Models/prostate_glm",
                    "__meta": {
                        "schema_name": "ModelKeyV3",
                        "schema_type": "Key<Model>",
                        "schema_version": 3
                    },
                    "name": "prostate_glm",
                    "type": "Key<Model>"
                },
                "default_value": null,
                "help": "Destination id for this model; auto-generated if not specified",
                "label": "model_id",
                "level": "critical",
                "name": "model_id",
                "required": false,
                "type": "Key<Model>",
                "values": []
            },
            ...
        ],
        ...

POST bodies

The fields of the request schema are sent in the POST body using x-www-form-urlencoded format, like an HTML form. A future version of H2O will move to using application/json. In the meantime, complex fields such as arrays are POSTed in the same format they would be in the JSON. For example, an array of ints might be posted in a field as [1, 10, 100]. Note the array of strings for the ignored_columns parameter in this GLM model builder POST body:

model_id=prostate_glm&training_frame=prostate.hex&nfolds=0&response_column=CAPSULE&ignored_columns=%5B%22%22%5D&ignore_const_cols=true&family=binomial&solver=AUTO&alpha=&lambda=&lambda_search=false&standardize=true&non_negative=false&score_each_iteration=false&max_iterations=-1&link=family_default&intercept=true&objective_epsilon=0.00001&beta_epsilon=0.0001&gradient_epsilon=0.0001&prior=-1&max_active_predictors=-1

Here the ignored_columns value is [""], urlencoded as %5B%22%22%5D; a value of ["ID"] would be sent as %5B%22ID%22%5D.
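The encoding described in this section can be sketched with the Python standard library. `encode_h2o_form` is a hypothetical helper name. Mapping `None` to an empty field is an assumption based on the `alpha=&lambda=` fields in the body above, and arrays are serialized as compact JSON before form-encoding.

```python
import json
from urllib.parse import urlencode


def encode_h2o_form(params):
    """Build an x-www-form-urlencoded body from a dict of parameters.

    Hypothetical helper. Lists are written in JSON array syntax,
    booleans as 'true'/'false', and None as an empty field.
    """
    fields = {}
    for name, value in params.items():
        if value is None:
            fields[name] = ""
        elif isinstance(value, bool):
            fields[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # Compact separators avoid spaces inside the array.
            fields[name] = json.dumps(list(value), separators=(",", ":"))
        else:
            fields[name] = str(value)
    return urlencode(fields)


body = encode_h2o_form({
    "model_id": "prostate_glm",
    "ignored_columns": ["ID"],
    "lambda_search": False,
    "alpha": None,
})
```

The resulting string can be sent as the POST body with a `Content-Type` of `application/x-www-form-urlencoded`.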

Metadata

The formats of all payloads (schemas) are available dynamically from the server using the /Metadata/schemas endpoints. You can fetch additional metadata for model builder (model algorithm) parameters from the /ModelBuilders endpoints. This metadata allows you to write a client that automatically adapts to new fields.

As an example, Flow has no hardwired knowledge of any of the model algos. It discovers the list of algos and their parameter information dynamically. This means that if you extend H2O with new algorithms or new fields for the built-in algorithms Flow will Just Work (tm).

Similarly, all the endpoints (URL patterns) are described dynamically by the /Metadata/endpoints endpoints.
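As a rough illustration of a metadata-driven client, the sketch below summarizes a list of parameter entries shaped like the `model_id` example in the Schemas section. The function names are hypothetical, and where the `parameters` list sits inside a /ModelBuilders response is not shown in this document, so the list is passed in directly.

```python
def summarize_parameters(parameters):
    """Map each parameter name to the fields a form builder would need."""
    summary = {}
    for p in parameters:
        summary[p["name"]] = {
            "type": p["type"],
            "required": p["required"],
            "level": p["level"],
            "default": p["default_value"],
            "help": p["help"],
        }
    return summary


def required_names(parameters):
    """Return the names of parameters that the schema marks as required."""
    return [p["name"] for p in parameters if p["required"]]
```

Because the client reads names, types, and defaults from the payload, a parameter added on the server shows up without any client code change.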

Error Condition Payloads

All errors return one of the non-2xx HTTP status codes mentioned above and return standardized error payloads. These contain an end-user-directed message, a developer-oriented message, the HTTP status, an optional dictionary of relevant values, and exception information if applicable.

Here is the result of requesting a Frame that is not present in the server:

GET http://127.0.0.1:54321/3/Frames/missing_frame

        {
            "__meta": {
                "schema_version": 3,
                "schema_name": "H2OErrorV3",
                "schema_type": "H2OError"
            },
            "timestamp": 1438634936808,
            "error_url": "/3/Frames/missing_frame",
            "msg": "Object 'missing_frame' not found for argument: key",
            "dev_msg": "Object 'missing_frame' not found for argument: key",
            "http_status": 404,
            "values": {
                "argument": "key",
                "name": "missing_frame"
            },
            "exception_type": "water.exceptions.H2OKeyNotFoundArgumentException",
            "exception_msg": "Object 'missing_frame' not found for argument: key",
            "stacktrace": [
                "water.api.FramesHandler.getFromDKV(FramesHandler.java:154)",
                "water.api.FramesHandler.doFetch(FramesHandler.java:239)",
                "water.api.FramesHandler.fetch(FramesHandler.java:225)",
                ...
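A client can turn such a payload into an exception. The sketch below is one possible approach, keyed on the `schema_type` value shown in the sample; the names `H2OError` (as a Python class) and `raise_for_h2o_error` are hypothetical and not taken from the H2O bindings.

```python
class H2OError(Exception):
    """Python-side wrapper for a standardized H2O error payload."""

    def __init__(self, payload):
        self.http_status = payload.get("http_status")
        self.msg = payload.get("msg")
        self.dev_msg = payload.get("dev_msg")
        self.values = payload.get("values") or {}
        self.exception_type = payload.get("exception_type")
        super().__init__(f"{self.http_status}: {self.msg}")


def raise_for_h2o_error(payload):
    """Raise H2OError for an error payload, otherwise return it unchanged."""
    meta = payload.get("__meta", {})
    if meta.get("schema_type") == "H2OError":
        raise H2OError(payload)
    return payload
```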

Control query parameters

H2O also supports "meta" query parameters to control the result payload. Currently the only one is exclude_fields, but more will be supported in subsequent releases.

exclude_fields

The result payload of some calls can get quite large. For example, a Frame or a Model built with a Frame that has 5,000 categorical columns may have a very large list of domains, or categorical levels.

If you don't require the server to return certain fields you can use the exclude_fields query parameter to exclude them. This reduces the size of the result, sometimes considerably, which speeds up JSON parsing in the client and reduces the chance that limited memory clients such as web browsers will run out of memory while processing the result.

The exclude_fields parameter accepts a comma-separated list of field names. Nested field names are separated by slashes.

As an example, one call of Flow to /Frames/{frame_id} uses this query parameter:

exclude_fields=frames/vec_ids,frames/columns/data,frames/columns/domain,frames/columns/histogram_bins,frames/columns/percentiles
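Building that query parameter can be sketched as follows. `with_exclude_fields` is a hypothetical helper. Leaving the slashes and commas unescaped matches the form shown above; the assumption that the server also accepts the percent-encoded form is untested here.

```python
from urllib.parse import urlencode


def with_exclude_fields(url, field_paths):
    """Append an exclude_fields query parameter to a request URL.

    field_paths is a list of slash-separated nested field names.
    """
    if not field_paths:
        return url
    # Keep '/' and ',' literal so the value reads like the example above.
    query = urlencode({"exclude_fields": ",".join(field_paths)}, safe="/,")
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{query}"
```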

Example Endpoints

This section lists a few endpoints to give you an idea of the functions that are available through the REST API. The reference documentation contains the full list.

Remember, Flow and the R and Python bindings access H2O only through the REST API, so if you find functionality in those clients, you'll find it in the REST API as well. The only caveat is data munging (e.g., slicing, creating new columns, etc.); that functionality is available through the /99/Rapids endpoint, which is under rapid change (pun intended). Contact us if you need to access those functions through the REST API.

Loading and parsing data files

GET /3/ImportFiles
Import raw data files into a single-column H2O Frame.

POST /3/ParseSetup
Guess the parameters for parsing raw byte-oriented data into an H2O Frame.

POST /3/Parse
Parse a raw byte-oriented Frame into a useful columnar data Frame.

Frames

GET /3/Frames
Return all Frames in the H2O distributed K/V store.

GET /3/Frames/(?.*)
Return the specified Frame.

GET /3/Frames/(?.*)/summary
Return a Frame, including the histograms, after forcing computation of rollups.

GET /3/Frames/(?.*)/columns/(?.*)/summary
Return the summary metrics for a column, e.g. mins, maxes, mean, sigma, percentiles, etc.

DELETE /3/Frames/(?.*)
Delete the specified Frame from the H2O distributed K/V store.

DELETE /3/Frames
Delete all Frames from the H2O distributed K/V store.

Building models

GET /3/ModelBuilders
Return the Model Builder metadata for all available algorithms.

GET /3/ModelBuilders/(?.*)
Return the Model Builder metadata for the specified algorithm.

POST /3/ModelBuilders/deeplearning/parameters
Validate a set of Deep Learning model builder parameters.

POST /3/ModelBuilders/deeplearning
Train a Deep Learning model on the specified Frame.

POST /3/ModelBuilders/glm/parameters
Validate a set of GLM model builder parameters.

POST /3/ModelBuilders/glm
Train a GLM model on the specified Frame.

...

Accessing and using models

GET /3/Models
Return all Models from the H2O distributed K/V store.

GET /3/Models/(?.*?)(\.java)?
Return the specified Model from the H2O distributed K/V store, optionally with the list of compatible Frames.  Using the .java extension will return the Java POJO.

POST /3/Predictions/models/(?.*)/frames/(?.*)
Score (generate predictions) for the specified Frame with the specified Model. Both the Frame of predictions and the metrics will be returned.

DELETE /3/Models/(?.*)
Delete the specified Model from the H2O distributed K/V store.

DELETE /3/Models
Delete all Models from the H2O distributed K/V store.

Administrative and utility

GET /3/About
Return information about this H2O cluster.

GET /3/Cloud
Determine the status of the nodes in the H2O cloud.

HEAD /3/Cloud
Determine the status of the nodes in the H2O cloud.

Job management and polling

GET /3/Jobs
Get a list of all the H2O Jobs (long-running actions).

GET /3/Jobs/(?.*)
Get the status of the given H2O Job (long-running action).

POST /3/Jobs/(?.*)/cancel
Cancel a running job.

Persistence

POST /3/Frames/(?.*)/export
Export a Frame to the given path with optional overwrite.

POST /99/Models.bin/(?.*)
Import given binary model into H2O.

GET /99/Models.bin/(?.*)
Export given model.

GBM_Example.flow

GBM_Example.flow, Step 1: Import

In Flow:

importFiles ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]

In curl:

curl -X GET 'http://127.0.0.1:54321/3/ImportFiles?path=http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz'

Result JSON:

{
    "__meta": {
        "schema_name": "ImportFilesV3",
        "schema_type": "Iced",
        "schema_version": 3
    },
    "_exclude_fields": "",
    "dels": [],
    "destination_frames": [
        "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
    ],
    "fails": [],
    "files": [
        "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
    ],
    "path": "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
}
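A client would typically pull the imported keys out of this result before moving on to ParseSetup. The sketch below assumes, from the field name alone, that `fails` lists paths that could not be imported; `imported_frames` is a hypothetical helper.

```python
def imported_frames(result):
    """Return the destination frame keys from an ImportFiles result.

    Raises if the 'fails' list is non-empty (assumed to mean that
    some paths could not be imported).
    """
    if result.get("fails"):
        raise RuntimeError(f"import failed for: {result['fails']}")
    return list(result.get("destination_frames", []))
```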

GBM_Example.flow, Step 2: ParseSetup

In Flow:

setupParse paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]

In curl:

curl -X POST http://127.0.0.1:54321/3/ParseSetup --data 'source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]'

GBM_Example.flow, Step 2 Result

{
  "__meta": {
    "schema_version": 3,
    "schema_name": "ParseSetupV3",
    "schema_type": "ParseSetup"
  },
  "_exclude_fields": "",
  "source_frames": [
    {
      "__meta": {
        "schema_version": 3,
        "schema_name": "FrameKeyV3",
        "schema_type": "Key<Frame>"
      },
      "name": "http:\/\/s3.amazonaws.com\/h2o-public-test-data\/smalldata\/flow_examples\/arrhythmia.csv.gz",
      "type": "Key<Frame>",
      "URL": "\/3\/Frames\/http:\/\/s3.amazonaws.com\/h2o-public-test-data\/smalldata\/flow_examples\/arrhythmia.csv.gz"
    }
  ],
  "parse_type": "CSV",
  "separator": 44,
  "single_quotes": false,
  "check_header": -1,
  "column_names": null,
  "column_types": [
    "Numeric",
    ...
  ],
  "na_strings": null,
  "column_name_filter": null,
  "column_offset": 0,
  "column_count": 0,
  "destination_frame": "arrhythmia.hex",
  "header_lines": 0,
  "number_columns": 280,
  "data": [
    [
      "75",
      "0",
      "190",
      ...
   ]
   ...
  ],
  "chunk_size": 4194304,
  "total_filtered_column_count": 280
}
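Most of the fields in this result are passed straight back in the Parse request of the next step. A minimal sketch of that hand-off follows; `parse_params_from_setup` is a hypothetical name, and the field list simply mirrors the parameters used in the Step 3 curl command.

```python
# ParseSetup result fields that the Step 3 request sends back unchanged.
PARSE_FIELDS = (
    "destination_frame",
    "parse_type",
    "separator",
    "number_columns",
    "single_quotes",
    "column_names",
    "column_types",
    "check_header",
    "chunk_size",
)


def parse_params_from_setup(setup, delete_on_done=True):
    """Derive Parse request parameters from a ParseSetup result."""
    params = {name: setup[name] for name in PARSE_FIELDS}
    # The request takes plain frame names, not the nested key objects.
    params["source_frames"] = [frame["name"] for frame in setup["source_frames"]]
    params["delete_on_done"] = delete_on_done
    return params
```

The returned dict still has to be form-encoded before it is POSTed to /3/Parse.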

GBM_Example.flow, Step 3: Parse

In Flow:

parseFiles
  paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]
  destination_frame: "arrhythmia.hex"
  parse_type: "CSV"
  separator: 44
  number_columns: 280
  single_quotes: false
  column_names: null
  column_types: ["Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Nu
meric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric"]
  delete_on_done: true
  check_header: -1
  chunk_size: 4194304

In curl:

curl -X POST http://127.0.0.1:54321/3/Parse --data 'destination_frame=arrhythmia.hex&source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]&parse_type=CSV&separator=44&number_columns=280&single_quotes=false&column_names=&column_types=["Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Nume
ric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric"]&check_header=-1&delete_on_done=true&chunk_size=4194304'

GBM_Example.flow, Step 3 Result

{
  "__meta": {
    "schema_version": 3,
    "schema_name": "ParseV3",
    "schema_type": "Iced"
  },
  "_exclude_fields": "",
  "destination_frame": {
    "__meta": {
      "schema_version": 3,
      "schema_name": "FrameKeyV3",
      "schema_type": "Key<Frame>"
    },
    "name": "arrhythmia.hex",
    "type": "Key<Frame>",
    "URL": "\/3\/Frames\/arrhythmia.hex"
  },
  "source_frames": [
    {
      "__meta": {
        "schema_version": 3,
        "schema_name": "FrameKeyV3",
        "schema_type": "Key<Frame>"
      },
      "name": "http:\/\/s3.amazonaws.com\/h2o-public-test-data\/smalldata\/flow_examples\/arrhythmia.csv.gz",
      "type": "Key<Frame>",
      "URL": "\/3\/Frames\/http:\/\/s3.amazonaws.com\/h2o-public-test-data\/smalldata\/flow_examples\/arrhythmia.csv.gz"
    }
  ],
  "parse_type": "CSV",
  "separator": 44,
  "single_quotes": false,
  "check_header": -1,
  "number_columns": 280,
  "column_names": null,
  "column_types": [
    "Numeric",
    "Numeric",
...
    "Numeric",
    "Numeric",
    "Numeric"
  ],
  "domains": null,
  "na_strings": null,
  "chunk_size": 4194304,
  "delete_on_done": true,
  "blocking": false,
  "remove_frame": false,
  "job": {
    "__meta": {
      "schema_version": 3,
      "schema_name": "JobV3",
      "schema_type": "Job"
    },
    "key": {
      "__meta": {
        "schema_version": 3,
        "schema_name": "JobKeyV3",
        "schema_type": "Key<Job>"
      },
      "name": "$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a",
      "type": "Key<Job>",
      "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a"
    },
    "description": "Parse",
    "status": "RUNNING",
    "progress": 0,
    "progress_msg": "Ingesting files.",
    "start_time": 1438888896402,
    "msec": 65,
    "dest": {
      "__meta": {
        "schema_version": 3,
        "schema_name": "FrameKeyV3",
        "schema_type": "Key<Frame>"
      },
      "name": "arrhythmia.hex",
      "type": "Key<Frame>",
      "URL": "\/3\/Frames\/arrhythmia.hex"
    },
    "exception": null,
    "messages": [
      
    ],
    "error_count": 0
  },
  "rows": 0,
  "vec_ids": null
}
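The Job key in this result is what the client polls next. A small sketch of deriving the polling path follows; `job_poll_path` is a hypothetical helper, and percent-encoding the key mirrors the `%24` escapes in the Step 4 curl command.

```python
from urllib.parse import quote


def job_poll_path(result):
    """Return the /3/Jobs/... path for the job embedded in a result payload."""
    key = result["job"]["key"]["name"]
    # '$' in the key becomes '%24', as in the curl example for Step 4.
    return "/3/Jobs/" + quote(key, safe="")
```

The same helper would apply to any result that embeds a job in this shape, such as the model builder result in Step 5.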

GBM_Example.flow, Step 4: Poll for job completion

Flow polls for Job completion automagically.

In curl:

curl -X GET 'http://127.0.0.1:54321/3/Jobs/%2403010a010a7f32d4ffffffff%24_b98fc5bba38d21ea53da2a0834c44f7a'

GBM_Example.flow, Step 4: Result

{
  "__meta": {
    "schema_version": 3,
    "schema_name": "JobsV3",
    "schema_type": "Iced"
  },
  "_exclude_fields": "",
  "job_id": {
    "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a"
  },
  "jobs": [
    {
      "__meta": {
        "schema_version": 3,
        "schema_name": "JobV3",
        "schema_type": "Job"
      },
      "key": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "JobKeyV3",
          "schema_type": "Key<Job>"
        },
        "name": "$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a",
        "type": "Key<Job>",
        "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a"
      },
      "description": "Parse",
      "status": "RUNNING",
      "progress": 1,
      "progress_msg": "Ingesting files.",
      "start_time": 1438888896402,
      "msec": 267,
      "dest": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "FrameKeyV3",
          "schema_type": "Key<Frame>"
        },
        "name": "arrhythmia.hex",
        "type": "Key<Frame>",
        "URL": "\/3\/Frames\/arrhythmia.hex"
      },
      "exception": null,
      "messages": [
        
      ],
      "error_count": 0
    }
  ]
}
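A client usually repeats this request until the job reaches a final state. The loop below is a sketch under stated assumptions: `wait_for_job` is a hypothetical helper, `fetch_jobs` is any callable that performs the GET above and returns the decoded JSON, and only the `RUNNING` and `DONE` statuses appear in this document (`FAILED` and `CANCELLED` are assumed names for the other final states).

```python
import time

# DONE is shown in this document; FAILED and CANCELLED are assumptions.
TERMINAL_STATUSES = {"DONE", "FAILED", "CANCELLED"}


def wait_for_job(fetch_jobs, poll_seconds=1.0, max_polls=600, sleep=time.sleep):
    """Poll until the first job in the response reaches a final status.

    fetch_jobs: callable returning a decoded /3/Jobs/{key} response.
    Returns the final job dict, or raises TimeoutError after max_polls.
    """
    for _ in range(max_polls):
        job = fetch_jobs()["jobs"][0]
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"job still not finished after {max_polls} polls")
```

Injecting `fetch_jobs` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without a running cluster.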

GBM_Example.flow, Step 5: Train the Model

In Flow:

buildModel 'gbm', {"model_id":"gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1","training_frame":"arrhythmia.hex","score_each_iteration":false,"response_column":"C1","ntrees":"20","max_depth":5,"min_rows":"25","nbins":20,"learn_rate":"0.3","distribution":"AUTO","balance_classes":false,"max_confusion_matrix_size":20,"max_hit_ratio_k":10,"class_sampling_factors":[],"max_after_balance_size":5,"seed":0}

In curl:

curl -X POST http://127.0.0.1:54321/3/ModelBuilders/gbm --data 'model_id=gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1&training_frame=arrhythmia.hex&score_each_iteration=false&response_column=C1&ntrees=20&max_depth=5&min_rows=25&nbins=20&learn_rate=0.3&distribution=AUTO&balance_classes=false&max_confusion_matrix_size=20&max_hit_ratio_k=10&class_sampling_factors=&max_after_balance_size=5&seed=0'

GBM_Example.flow, Step 5: Result

{
  "__meta": {
    "schema_version": 3,
    "schema_name": "GBMV3",
    "schema_type": "GBM"
  },
  "_exclude_fields": "",
  "job": {
    "__meta": {
      "schema_version": 3,
      "schema_name": "JobV3",
      "schema_type": "Job"
    },
    "key": {
      "__meta": {
        "schema_version": 3,
        "schema_name": "JobKeyV3",
        "schema_type": "Key<Job>"
      },
      "name": "$03010a010a7f32d4ffffffff$_881e60f52af792b71d20540604b742dd",
      "type": "Key<Job>",
      "URL": "\/3\/Jobs\/$03010a010a7f32d4ffffffff$_881e60f52af792b71d20540604b742dd"
    },
    "description": "GBM",
    "status": "RUNNING",
    "progress": 0,
    "progress_msg": "Running...",
    "start_time": 1438888898858,
    "msec": 185,
    "dest": {
      "__meta": {
        "schema_version": 3,
        "schema_name": "ModelKeyV3",
        "schema_type": "Key<Model>"
      },
      "name": "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
      "type": "Key<Model>",
      "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
    },
    "exception": null,
    "messages": [
      
    ],
    "error_count": 0
  },
  "algo": "gbm",
  "algo_full_name": "Gradient Boosting Machine",
  "can_build": [
    "Regression",
    "Binomial",
    "Multinomial"
  ],
  "visibility": "Stable",
  "messages": [
    
  ],
  "error_count": 0,
  "parameters": [
    {
      "__meta": {
        "schema_version": 3,
        "schema_name": "ModelParameterSchemaV3",
        "schema_type": "Iced"
      },
      "name": "model_id",
      "label": "model_id",
      "help": "Destination id for this model; auto-generated if not specified",
      "required": false,
      "type": "Key<Model>",
      "default_value": null,
      "actual_value": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "ModelKeyV3",
          "schema_type": "Key<Model>"
        },
        "name": "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
        "type": "Key<Model>",
        "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
      },
      "level": "critical",
      "values": [
        
      ],
      "is_member_of_frames": [
        
      ],
      "is_mutually_exclusive_with": [
        
      ]
    }, ...
  ]
}

GBM_Example.flow, Step 6: Poll for job completion

Same as for Parse (Step 4), using the Job key returned in the Step 5 result.

GBM_Example.flow, Step 7: View the Model

In Flow:

getModel "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"

In curl:

curl -X GET 'http://127.0.0.1:54321/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1'

GBM_Example.flow, Step 7: Result

{
  "models": [
    {
      "model_id": {
        "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
      },
      "algo": "gbm",
      "algo_full_name": "Gradient Boosting Machine",
      "parameters": [
        ...
      ],
      "output": {
        "__meta": {
          "schema_name": "GBMModelOutputV3"
        },
        "model_category": "Regression",
        ...
        "scoring_history": {
          ...
        },
        "training_metrics": {
          "model_category": "Regression",
          "MSE": 31.32188458883,
          "r2": 0.88422887487626,
          "mean_residual_deviance": 31.32188458883
        },
        "status": "DONE",
        "run_time": 3211
      }
    }
  ]
}
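Reading the training metrics out of this response is a matter of walking the nesting shown above. A minimal sketch follows; `training_metrics` is a hypothetical helper that relies only on the keys visible in the abbreviated sample.

```python
def training_metrics(models_response, index=0):
    """Return (status, metrics dict) for one model in a /3/Models response."""
    output = models_response["models"][index]["output"]
    return output["status"], output["training_metrics"]
```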

GBM_Example.flow, Step 8: Predictions

In Flow:

predict model: "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1", frame: "arrhythmia.hex", predictions_frame: "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"

In curl:

curl -X POST 'http://127.0.0.1:54321/3/Predictions/models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1/frames/arrhythmia.hex' --data 'predictions_frame=prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db'

GBM_Example.flow, Step 8: Result

{
  "__meta": {
    "schema_version": 3,
    "schema_name": "ModelMetricsListSchemaV3",
    "schema_type": "ModelMetricsList"
  },
  "predictions_frame": {
    "__meta": {
      "schema_version": 3,
      "schema_name": "FrameKeyV3",
      "schema_type": "Key<Frame>"
    },
    "name": "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db",
    "type": "Key<Frame>",
    "URL": "\/3\/Frames\/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"
  },
  "model_metrics": [
    {
      "__meta": {
        "schema_version": 3,
        "schema_name": "ModelMetricsRegressionV3",
        "schema_type": "ModelMetricsRegression"
      },
      "model": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "ModelKeyV3",
          "schema_type": "Key<Model>"
        },
        "name": "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
        "type": "Key<Model>",
        "URL": "\/3\/Models\/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
      },
      "model_checksum": 7.1488755500207e+18,
      "frame": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "FrameKeyV3",
          "schema_type": "Key<Frame>"
        },
        "name": "arrhythmia.hex",
        "type": "Key<Frame>",
        "URL": "\/3\/Frames\/arrhythmia.hex"
      },
      "frame_checksum": -1.6112849483913e+17,
      "model_category": "Regression",
      "scoring_time": 1438888905373,
      "predictions": {
        "__meta": {
          "schema_version": 3,
          "schema_name": "FrameV3",
          "schema_type": "Frame"
        },
        "frame_id": {
          "__meta": {
            "schema_version": 3,
            "schema_name": "FrameKeyV3",
            "schema_type": "Key<Frame>"
          },
          "name": "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db",
          "type": "Key<Frame>",
          "URL": "\/3\/Frames\/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"
        },
        "byte_size": 3684,
        "is_text": false,
        "row_offset": 0,
        "row_count": 100,
        "column_offset": 0,
        "column_count": 1,
        "total_column_count": 1,
        "checksum": 3.1483215706755e+18,
        "rows": 452,
        "columns": [
          {
            "__meta": {
              "schema_version": 3,
              "schema_name": "ColV3",
              "schema_type": "Vec"
            },
            "label": "predict",
            "missing_count": 0,
            "zero_count": 0,
            "positive_infinity_count": 0,
            "negative_infinity_count": 0,
            "mins": null,
            "maxs": null,
            "mean": 0,
            "sigma": 0,
            "type": "real",
            "domain": null,
            "domain_cardinality": 0,
            "data": [
              27.761375975688,
              55.923557338198,
              28.388683621664,
              35.275735166748,
              53.253980894466,
              41.531820529033
            ]
          }
        ]
      },
      "MSE": 31.321880321916,
      "r2": 0.88422889064751,
      "mean_residual_deviance": 31.321880321916
    }
  ]
}
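The predicted values sit several levels deep in this payload. The sketch below pulls out the predictions frame name and the data of one column; `predicted_values` is a hypothetical helper. Note that the embedded frame reports a `row_count` of 100 against 452 `rows`, so the payload appears to carry a preview rather than the whole column.

```python
def predicted_values(result, column=0):
    """Return (predictions frame name, values) from a Predictions result.

    Only the rows included in the payload are returned.
    """
    frame = result["model_metrics"][0]["predictions"]
    return frame["frame_id"]["name"], frame["columns"][column]["data"]
```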
