# Porting R Scripts

This document outlines how to port R scripts written for previous versions of H2O (Nunes 2.8.6.2 or prior, also known as "H2O Classic") so that they are compatible with the new H2O 3.0 API. When upgrading from H2O Classic to H2O 3.0, most functions are the same. However, some differences must be resolved when porting any scripts that were originally created for H2O Classic.

The original R script for H2O Classic is listed first, followed by the updated script for H2O 3.0.

Some of the parameters have been renamed for consistency. For each algorithm, a table that describes the differences is provided.

For additional assistance within R, enter a question mark before the command (for example, `?h2o.glm`).

There is also a "shim" available that reviews R scripts created with previous versions of H2O, identifies deprecated or renamed parameters, and suggests replacements. For more information, refer to the [`shim.R` source in the repo](https://github.com/h2oai/h2o-dev/blob/d9693a97da939a2b77c24507c8b40a5992192489/h2o-r/h2o-package/R/shim.R).

## Changes from H2O 2.8 to H2O 3.0

### `h2o.exec`

The `h2o.exec` command is no longer supported. Any workflows using `h2o.exec` must be revised to remove this command. If the H2O 3.0 workflow contains any parameters or commands from H2O Classic, errors will result and the workflow will fail.

The purpose of `h2o.exec` was to wrap expressions so that they could be evaluated in a single `\Exec2` call. For example,
`h2o.exec(fr[,1] + 2/fr[,3])`
and
`fr[,1] + 2/fr[,3]`
produced the same results in H2O Classic. However, the first example makes a single REST call and uses a single temp object, while the second makes several REST calls and uses several temp objects.

Due to the improved architecture in H2O 3.0, `h2o.exec` is no longer needed: the expression can be processed by R as a typical "unwrapped" R expression.

Currently, the only known exception is when `factor` is used in conjunction with `h2o.exec`. For example, `h2o.exec(fr$myIntCol <- factor(fr$myIntCol))` would become `fr$myIntCol <- as.factor(fr$myIntCol)`.

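The `factor` exception above can be sketched as a small function. The helper name `port_factor_column` is hypothetical and not part of the h2o package. Because `as.factor()` is a base R generic, the sketch runs here on a plain `data.frame` standing in for an H2O frame; with an H2O 3.0 frame the same assignment would be used.

```r
# Hypothetical helper (not part of the h2o package): the H2O 3.0 form of the
# factor conversion, written as a plain "unwrapped" R expression.
port_factor_column <- function(fr) {
  # H2O Classic form was: h2o.exec(fr$myIntCol <- factor(fr$myIntCol))
  fr$myIntCol <- as.factor(fr$myIntCol)
  fr
}

# Local demonstration: a base R data.frame stands in for an H2O frame.
demo <- port_factor_column(data.frame(myIntCol = c(1L, 2L, 2L, 3L)))
```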
Note also that an array is not enclosed in a string:

An int array is `[1, 2, 3]`, *not* `"[1, 2, 3]"`.

A string array is `["f00", "b4r"]`, *not* `"[\"f00\", \"b4r\"]"`.

Only string values are enclosed in double quotation marks (`"`).

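As an illustration of the array format described above, the following base R sketch renders a vector as an array literal. The helper `to_array_literal` is hypothetical and is not provided by the h2o package; it only demonstrates which parts are quoted.

```r
# Hypothetical helper (not part of the h2o package): render an R vector as an
# array literal in the format described above.
to_array_literal <- function(x) {
  # Only string values are wrapped in double quotation marks.
  items <- if (is.character(x)) paste0('"', x, '"') else as.character(x)
  # The array as a whole is bracketed, never quoted.
  paste0("[", paste(items, collapse = ", "), "]")
}

int_array <- to_array_literal(c(1L, 2L, 3L))    # [1, 2, 3]
str_array <- to_array_literal(c("f00", "b4r"))  # ["f00", "b4r"]
```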
<a name="h2operf"></a>
### `h2o.performance`

To access any exclusively binomial output, use `h2o.performance`, optionally with the corresponding accessor. The accessors work only on the model metrics object created by `h2o.performance`. Each accessor is named for its corresponding field (for example, `h2o.AUC`, `h2o.gini`, `h2o.F1`). `h2o.performance` supports all current algorithms except for K-Means.

If you specify a data frame as a second parameter, H2O will use the specified data frame for scoring. If you do not specify a second parameter, the model's training metrics are used.

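The two calling patterns described above might look like the following sketch. It assumes the h2o package is installed and a cluster is running when the function is called; the function name `binomial_auc` and its arguments are hypothetical. The `@metrics$AUC` slot path is taken from the output tables later in this guide rather than from an accessor, since accessor names may differ between releases.

```r
# Sketch only: needs the h2o package and a running H2O 3.0 cluster when called.
# `model` is a trained binomial H2O model; `frame` is an optional H2O frame.
binomial_auc <- function(model, frame = NULL) {
  perf <- if (is.null(frame)) {
    h2o::h2o.performance(model)         # no second parameter: training metrics
  } else {
    h2o::h2o.performance(model, frame)  # second parameter: score on `frame`
  }
  # Slot path as listed in the output tables of this guide.
  perf@metrics$AUC
}
```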
### `xval` and `validation` slots

The `xval` slot has been removed, as `nfolds` is not currently supported.

The `validation` slot has been merged with the `model` slot.

### Principal Components Regression (PCR)

Principal Components Regression (PCR) has also been deprecated. To obtain PCR values, create a Principal Components Analysis (PCA) model, then build a GLM model on the data scored by the PCA model.

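The PCA-then-GLM workflow described above could be sketched as follows. This is an outline under assumptions, not the documented replacement: the function name `pcr_sketch` and its arguments are hypothetical, and the argument names passed to `h2o.prcomp`, `h2o.predict`, `h2o.cbind`, and `h2o.glm` follow recent h2o releases, so they may need adjusting for a specific 3.0 build.

```r
# Sketch only: needs the h2o package and a running H2O 3.0 cluster when called.
# `train` is an H2O frame, `x` the predictor column names, `y` the response
# column name, and `k` the number of principal components (all hypothetical).
pcr_sketch <- function(train, x, y, k = 2) {
  pca <- h2o::h2o.prcomp(training_frame = train, x = x, k = k)
  # Score the training data to obtain its principal component columns.
  scores <- h2o::h2o.predict(pca, train)
  # Attach the response so the GLM can be trained on the scored data.
  pcr_frame <- h2o::h2o.cbind(scores, train[, y])
  h2o::h2o.glm(x = colnames(scores), y = y, training_frame = pcr_frame)
}
```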
### Saving and Loading Models

Saving and loading a model from R is supported in version 3.0.0.18 and later. H2O 3.0 uses the same binary serialization method as previous versions of H2O, but saves the model and its dependencies into a directory, with each object as a separate file. The `save_CV` option available in previous versions of H2O has been deprecated, as `h2o.saveAll` and `h2o.loadAll` are not currently supported. The following commands are now supported:

- `h2o.saveModel`
- `h2o.loadModel`

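A round trip with the two supported commands might look like the sketch below. The function name `save_and_reload` is hypothetical. The `path` argument name and the assumption that `h2o.saveModel` returns the saved location follow recent h2o releases; the exact signature in 3.0.0.18 may differ, so check `?h2o.saveModel` for the installed version.

```r
# Sketch only: needs the h2o package and a running H2O 3.0 cluster when called.
# `model` is a trained H2O model; `dir` is a directory the cluster can write to.
save_and_reload <- function(model, dir = tempdir()) {
  # Assumption: h2o.saveModel returns the location the model was written to.
  saved_path <- h2o::h2o.saveModel(model, path = dir)
  h2o::h2o.loadModel(saved_path)
}
```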

**Table of Contents**

- [GBM](#GBM)
- [GLM](#GLM)
- [K-Means](#Kmeans)
- [Deep Learning](#DL)
- [Distributed Random Forest](#DRF)

<a name="GBM"></a>
## GBM

N-fold cross-validation and grid search will be supported in a future version of H2O 3.0.

### Renamed GBM Parameters

The following parameters have been renamed, but retain the same functions:

H2O Classic Parameter Name | H2O 3.0 Parameter Name
-------------------|-----------------------
`data` | `training_frame`
`key` | `model_id`
`n.trees` | `ntrees`
`interaction.depth` | `max_depth`
`n.minobsinnode` | `min_rows`
`shrinkage` | `learn_rate`
`n.bins` | `nbins`
`validation` | `validation_frame`
`balance.classes` | `balance_classes`
`max.after.balance.size` | `max_after_balance_size`

### Deprecated GBM Parameters

The following parameters have been removed:

- `group_split`: Bit-set group splitting of categorical variables is now the default.
- `importance`: Variable importances are now computed automatically and displayed in the model output.
- `holdout.fraction`: The fraction of the training data to hold out for validation is no longer supported.
- `grid.parallelism`: Specifying the number of parallel threads to run during a grid search is no longer supported. Grid search will be supported in a future version of H2O 3.0.

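The renames and removals listed above can be applied mechanically to an argument list. The following base R sketch shows one way to do that; `gbm_renames`, `gbm_removed`, and `port_gbm_args` are hypothetical names introduced for illustration and are not part of the h2o package or its shim.

```r
# Hypothetical helper (not part of the h2o package): translate a named list of
# H2O Classic GBM arguments into their H2O 3.0 names.
gbm_renames <- c(
  data = "training_frame", key = "model_id", n.trees = "ntrees",
  interaction.depth = "max_depth", n.minobsinnode = "min_rows",
  shrinkage = "learn_rate", n.bins = "nbins", validation = "validation_frame",
  balance.classes = "balance_classes",
  max.after.balance.size = "max_after_balance_size"
)
gbm_removed <- c("group_split", "importance", "holdout.fraction",
                 "grid.parallelism")

port_gbm_args <- function(args) {
  # Drop arguments that H2O 3.0 no longer accepts.
  args <- args[!names(args) %in% gbm_removed]
  # Rename the arguments listed in the table; leave all others untouched.
  hit <- names(args) %in% names(gbm_renames)
  names(args)[hit] <- unname(gbm_renames[names(args)[hit]])
  args
}

ported <- port_gbm_args(list(x = 1:4, y = 5, n.trees = 10, shrinkage = 0.1,
                             importance = TRUE))
```

The resulting list could then be passed to the H2O 3.0 function, for example with `do.call(h2o.gbm, ported)`, once the frame arguments have been supplied.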
### New GBM Parameters

The following parameters have been added:

- `seed`: A random number to control sampling and initialization when `balance_classes` is enabled.
- `score_each_iteration`: Display error rate information after each tree in the requested set is built.
- `build_tree_one_node`: Run on a single node to use fewer CPUs.

### GBM Algorithm Comparison

H2O Classic | H2O 3.0
------------- | -------------
`h2o.gbm <- function(` | `h2o.gbm <- function(`
`x,` | `x,`
`y,` | `y,`
`data,` | `training_frame,`
`key = "",` | `model_id,`
&nbsp; | `checkpoint,`
`distribution = 'multinomial',` | `distribution = c("AUTO", "gaussian", "bernoulli", "multinomial", "poisson", "gamma", "tweedie"),`
&nbsp; | `tweedie_power = 1.5,`
`n.trees = 10,` | `ntrees = 50,`
`interaction.depth = 5,` | `max_depth = 5,`
`n.minobsinnode = 10,` | `min_rows = 10,`
`shrinkage = 0.1,` | `learn_rate = 0.1,`
&nbsp; | `sample_rate = 1,`
&nbsp; | `col_sample_rate = 1,`
`n.bins = 20,` | `nbins = 20,`
&nbsp; | `nbins_top_level,`
&nbsp; | `nbins_cats = 1024,`
`validation,` | `validation_frame = NULL,`
`balance.classes = FALSE,` | `balance_classes = FALSE,`
`max.after.balance.size = 5,` | `max_after_balance_size = 1,`
&nbsp; | `seed,`
&nbsp; | `build_tree_one_node = FALSE,`
&nbsp; | `nfolds = 0,`
&nbsp; | `fold_column = NULL,`
&nbsp; | `fold_assignment = c("AUTO", "Random", "Modulo"),`
&nbsp; | `keep_cross_validation_predictions = FALSE,`
&nbsp; | `score_each_iteration = FALSE,`
&nbsp; | `stopping_rounds = 0,`
&nbsp; | `stopping_metric = c("AUTO", "deviance", "logloss", "MSE", "AUC", "r2", "misclassification"),`
&nbsp; | `stopping_tolerance = 0.001,`
&nbsp; | `offset_column = NULL,`
&nbsp; | `weights_column = NULL)`
`group_split = TRUE,` | &nbsp;
`importance = FALSE,` | &nbsp;
`holdout.fraction = 0,` | &nbsp;
`class.sampling.factors = NULL,` | &nbsp;
`grid.parallelism = 1)` | &nbsp;

### Output

The following table provides the component name in H2O Classic, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [`h2o.performance`](#h2operf).

H2O Classic | H2O 3.0 | Model Type
------------- | ------------- | -------------
`@model$priorDistribution` | &nbsp; | `all`
`@model$params` | `@allparameters` | `all`
`@model$err` | `@model$scoring_history` | `all`
`@model$classification` | &nbsp; | `all`
`@model$varimp` | `@model$variable_importances` | `all`
`@model$confusion` | `@model$training_metrics@metrics$cm$table` | `binomial` and `multinomial`
`@model$auc` | `@model$training_metrics@metrics$AUC` | `binomial`
`@model$gini` | `@model$training_metrics@metrics$Gini` | `binomial`
`@model$best_cutoff` | &nbsp; | `binomial`
`@model$F1` | `@model$training_metrics@metrics$thresholds_and_metric_scores$f1` | `binomial`
`@model$F2` | `@model$training_metrics@metrics$thresholds_and_metric_scores$f2` | `binomial`
`@model$accuracy` | `@model$training_metrics@metrics$thresholds_and_metric_scores$accuracy` | `binomial`
`@model$error` | &nbsp; | `binomial`
`@model$precision` | `@model$training_metrics@metrics$thresholds_and_metric_scores$precision` | `binomial`
`@model$recall` | `@model$training_metrics@metrics$thresholds_and_metric_scores$recall` | `binomial`
`@model$mcc` | `@model$training_metrics@metrics$thresholds_and_metric_scores$absolute_MCC` | `binomial`
`@model$max_per_class_err` | currently replaced by `@model$training_metrics@metrics$thresholds_and_metric_scores$min_per_class_correct` | `binomial`

---

<a name="GLM"></a>
## GLM

N-fold cross-validation and grid search will be supported in a future version of H2O 3.0.

### Renamed GLM Parameters

The following parameters have been renamed, but retain the same functions:

H2O Classic Parameter Name | H2O 3.0 Parameter Name
-------------------|-----------------------
`data` | `training_frame`
`key` | `model_id`
`nlambda` | `nlambdas`
`lambda.min.ratio` | `lambda_min_ratio`
`iter.max` | `max_iterations`
`epsilon` | `beta_epsilon`

### Deprecated GLM Parameters

The following parameters have been removed:

- `return_all_lambda`: A logical value indicating whether to return every model built during the lambda search. (may be re-added)
- `higher_accuracy`: For improved accuracy, adjust the `beta_epsilon` value.
- `strong_rules`: Discards predictors likely to have 0 coefficients prior to model building. (may be re-added as enabled by default)
- `non_negative`: Specify a non-negative response. (may be re-added)
- `variable_importances`: Variable importances are now computed automatically and displayed in the model output. They have been renamed to *Normalized Coefficient Magnitudes*.
- `disable_line_search`: This parameter has been deprecated, as it was mainly used for testing purposes.
- `offset`: Specify a column as an offset. (may be re-added)
- `max_predictors`: Stops training the algorithm if the number of predictors exceeds the specified value. (may be re-added)

### New GLM Parameters

The following parameters have been added:

- `validation_frame`: Specify the validation dataset.
- `solver`: Select `IRLSM` or `L_BFGS`.

### GLM Algorithm Comparison

H2O Classic | H2O 3.0
------------- | -------------
`h2o.glm <- function(` | `h2o.startGLMJob <- function(`
`x,` | `x,`
`y,` | `y,`
`data,` | `training_frame,`
`key = "",` | `model_id,`
&nbsp; | `validation_frame,`
`iter.max = 100,` | `max_iterations = 50,`
`epsilon = 1e-4,` | `beta_epsilon = 0,`
`strong_rules = TRUE,` | &nbsp;
`return_all_lambda = FALSE,` | &nbsp;
`intercept = TRUE,` | `intercept = TRUE,`
`non_negative = FALSE,` | &nbsp;
&nbsp; | `solver = c("IRLSM", "L_BFGS"),`
`standardize = TRUE,` | `standardize = TRUE,`
`family,` | `family = c("gaussian", "binomial", "poisson", "gamma", "tweedie"),`
`link,` | `link = c("family_default", "identity", "logit", "log", "inverse", "tweedie"),`
`tweedie.p = ifelse(family == "tweedie", 1.5, NA_real_),` | `tweedie_variance_power = NaN,`
&nbsp; | `tweedie_link_power = NaN,`
`alpha = 0.5,` | `alpha = 0.5,`
`prior = NULL,` | `prior = 0.0,`
`lambda = 1e-5,` | `lambda = 1e-05,`
`lambda_search = FALSE,` | `lambda_search = FALSE,`
`nlambda = -1,` | `nlambdas = -1,`
`lambda.min.ratio = -1,` | `lambda_min_ratio = 1.0,`
`use_all_factor_levels = FALSE,` | `use_all_factor_levels = FALSE,`
`nfolds = 0,` | `nfolds = 0,`
`beta_constraints = NULL,` | `beta_constraint = NULL)`
`higher_accuracy = FALSE,` | &nbsp;
`variable_importances = FALSE,` | &nbsp;
`disable_line_search = FALSE,` | &nbsp;
`offset = NULL,` | &nbsp;
`max_predictors = -1)` | &nbsp;

### Output

The following table provides the component name in H2O Classic, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [`h2o.performance`](#h2operf).

H2O Classic | H2O 3.0 | Model Type
------------- | ------------- | -------------
`@model$params` | `@allparameters` | `all`
`@model$coefficients` | `@model$coefficients` | `all`
`@model$normalized_coefficients` | `@model$coefficients_table$norm_coefficients` | `all`
`@model$rank` | `@model$rank` | `all`
`@model$iter` | `@model$iter` | `all`
`@model$lambda` | &nbsp; | `all`
`@model$deviance` | `@model$residual_deviance` | `all`
`@model$null.deviance` | `@model$null_deviance` | `all`
`@model$df.residual` | `@model$residual_degrees_of_freedom` | `all`
`@model$df.null` | `@model$null_degrees_of_freedom` | `all`
`@model$aic` | `@model$AIC` | `all`
`@model$train.err` | &nbsp; | `binomial`
`@model$prior` | &nbsp; | `binomial`
`@model$thresholds` | `@model$threshold` | `binomial`
`@model$best_threshold` | &nbsp; | `binomial`
`@model$auc` | `@model$AUC` | `binomial`
`@model$confusion` | &nbsp; | `binomial`

<a name="Kmeans"></a>
## K-Means

### Renamed K-Means Parameters

The following parameters have been renamed, but retain the same functions:

H2O Classic Parameter Name | H2O 3.0 Parameter Name
-------------------|-----------------------
`data` | `training_frame`
`key` | `model_id`
`centers` | `k`
`cols` | `x`
`iter.max` | `max_iterations`
`normalize` | `standardize`

**Note**: In H2O Classic, the `normalize` parameter was disabled by default. The `standardize` parameter is enabled by default in H2O 3.0 to provide more accurate results for datasets containing columns with large values.

### New K-Means Parameters

The following parameters have been added:

- `user` has been added as an additional option for the `init` parameter. Using this option forces the K-Means algorithm to start at the user-specified points.
- `user_points`: Specify starting points for the K-Means algorithm.

### K-Means Algorithm Comparison

H2O Classic | H2O 3.0
------------- | -------------
`h2o.kmeans <- function(` | `h2o.kmeans <- function(`
`data,` | `training_frame,`
`cols = '',` | `x,`
`centers,` | `k,`
`key = "",` | `model_id,`
`iter.max = 10,` | `max_iterations = 1000,`
`normalize = FALSE,` | `standardize = TRUE,`
`init = "none",` | `init = c("Furthest","Random", "PlusPlus"),`
`seed = 0)` | `seed)`

### Output

The following table provides the component name in H2O Classic and the corresponding component name in H2O 3.0 (if supported).

H2O Classic | H2O 3.0
------------- | -------------
`@model$params` | `@allparameters`
`@model$centers` | `@model$centers`
`@model$tot.withinss` | `@model$tot_withinss`
`@model$size` | `@model$size`
`@model$iter` | `@model$iterations`
&nbsp; | `@model$_scoring_history`
&nbsp; | `@model$_model_summary`

---

<a name="DL"></a>
## Deep Learning

N-fold cross-validation and grid search will be supported in a future version of H2O 3.0.

**Note**: If the results in the confusion matrix are incorrect, verify that `score_training_samples` is equal to 0 so that all rows are scored. By default, only the first 10,000 rows are included.

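The note above translates into a single extra argument at training time. The sketch below assumes the h2o package is installed and a cluster is running when the function is called; the wrapper name `dl_full_scoring` and its arguments are hypothetical, while `score_training_samples` is the parameter named in the note.

```r
# Sketch only: needs the h2o package and a running H2O 3.0 cluster when called.
# `train` is an H2O frame; `x` and `y` are predictor and response column names.
dl_full_scoring <- function(train, x, y) {
  h2o::h2o.deeplearning(
    x = x, y = y, training_frame = train,
    score_training_samples = 0  # 0 = score on all training rows
  )
}
```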
### Renamed Deep Learning Parameters

The following parameters have been renamed, but retain the same functions:

H2O Classic Parameter Name | H2O 3.0 Parameter Name
-------------------|-----------------------
`data` | `training_frame`
`key` | `model_id`
`validation` | `validation_frame`
`class.sampling.factors` | `class_sampling_factors`
`override_with_best_model` | `overwrite_with_best_model`
`dlmodel@model$valid_class_error` | `@model$validation_metrics@metrics$MSE`

### Deprecated DL Parameters

The following parameters have been removed:

- `classification`: Classification is now inferred from the data type.
- `holdout_fraction`: Fraction of the training data to hold out for validation.
- `dlmodel@model$best_cutoff`: This output parameter has been removed.

### New DL Parameters

The following parameters have been added:

- `export_weights_and_biases`: An additional option allowing users to export the raw weights and biases as H2O frames.

The following options for the `loss` parameter have been added:

- `absolute`: Provides strong penalties for mispredictions
- `huber`: Can improve results for regression

### DL Algorithm Comparison

H2O Classic | H2O 3.0
------------- | -------------
`h2o.deeplearning <- function(x,` | `h2o.deeplearning(x,`
`y,` | `y,`
`data,` | `training_frame,`
`key = "",` | `model_id = "",`
`override_with_best_model,` | `overwrite_with_best_model = TRUE,`
`classification = TRUE,` | &nbsp;
`nfolds = 0,` | `nfolds = 0,`
`validation,` | `validation_frame,`
`holdout_fraction = 0,` | &nbsp;
`checkpoint = " ",` | `checkpoint,`
`autoencoder,` | `autoencoder = FALSE,`
`use_all_factor_levels,` | `use_all_factor_levels = TRUE,`
`activation,` | `activation = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout"),`
`hidden,` | `hidden = c(200, 200),`
`epochs,` | `epochs = 10.0,`
`train_samples_per_iteration,` | `train_samples_per_iteration = -2,`
&nbsp; | `target_ratio_comm_to_comp = 0.05,`
`seed,` | `seed,`
`adaptive_rate,` | `adaptive_rate = TRUE,`
`rho,` | `rho = 0.99,`
`epsilon,` | `epsilon = 1e-08,`
`rate,` | `rate = .005,`
`rate_annealing,` | `rate_annealing = 1e-06,`
`rate_decay,` | `rate_decay = 1.0,`
`momentum_start,` | `momentum_start = 0,`
`momentum_ramp,` | `momentum_ramp = 1e+06,`
`momentum_stable,` | `momentum_stable = 0,`
`nesterov_accelerated_gradient,` | `nesterov_accelerated_gradient = TRUE,`
`input_dropout_ratio,` | `input_dropout_ratio = 0.0,`
`hidden_dropout_ratios,` | `hidden_dropout_ratios,`
`l1,` | `l1 = 0.0,`
`l2,` | `l2 = 0.0,`
`max_w2,` | `max_w2 = Inf,`
`initial_weight_distribution,` | `initial_weight_distribution = c("UniformAdaptive","Uniform", "Normal"),`
`initial_weight_scale,` | `initial_weight_scale = 1.0,`
`loss,` | `loss = c("Automatic", "CrossEntropy", "Quadratic", "Absolute", "Huber"),`
&nbsp; | `distribution = c("AUTO", "gaussian", "bernoulli", "multinomial", "poisson", "gamma", "tweedie", "laplace", "huber"),`
&nbsp; | `tweedie_power = 1.5,`
`score_interval,` | `score_interval = 5,`
`score_training_samples,` | `score_training_samples = 10000,`
`score_validation_samples,` | `score_validation_samples = 0,`
`score_duty_cycle,` | `score_duty_cycle = 0.1,`
`classification_stop,` | `classification_stop = 0,`
`regression_stop,` | `regression_stop = 1e-6,`
&nbsp; | `stopping_rounds = 5,`
&nbsp; | `stopping_metric = c("AUTO", "deviance", "logloss", "MSE", "AUC", "r2", "misclassification"),`
&nbsp; | `stopping_tolerance = 0,`
`quiet_mode,` | `quiet_mode = FALSE,`
`max_confusion_matrix_size,` | `max_confusion_matrix_size,`
`max_hit_ratio_k,` | `max_hit_ratio_k,`
`balance_classes,` | `balance_classes = FALSE,`
`class_sampling_factors,` | `class_sampling_factors,`
`max_after_balance_size,` | `max_after_balance_size,`
`score_validation_sampling,` | `score_validation_sampling,`
`diagnostics,` | `diagnostics = TRUE,`
`variable_importances,` | `variable_importances = FALSE,`
`fast_mode,` | `fast_mode = TRUE,`
`ignore_const_cols,` | `ignore_const_cols = TRUE,`
`force_load_balance,` | `force_load_balance = TRUE,`
`replicate_training_data,` | `replicate_training_data = TRUE,`
`single_node_mode,` | `single_node_mode = FALSE,`
`shuffle_training_data,` | `shuffle_training_data = FALSE,`
`sparse,` | `sparse = FALSE,`
`col_major,` | `col_major = FALSE,`
`max_categorical_features,` | `max_categorical_features,`
`reproducible)` | `reproducible = FALSE,`
`average_activation` | `average_activation = 0,`
&nbsp; | `sparsity_beta = 0,`
&nbsp; | `export_weights_and_biases = FALSE,`
&nbsp; | `offset_column = NULL,`
&nbsp; | `weights_column = NULL,`
&nbsp; | `nfolds = 0,`
&nbsp; | `fold_column = NULL,`
&nbsp; | `fold_assignment = c("AUTO", "Random", "Modulo"),`
&nbsp; | `keep_cross_validation_predictions = FALSE)`

### Output

The following table provides the component name in H2O Classic, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [`h2o.performance`](#h2operf).

H2O Classic | H2O 3.0 | Model Type
------------- | ------------- | -------------
`@model$priorDistribution` | &nbsp; | `all`
`@model$params` | `@allparameters` | `all`
`@model$train_class_error` | `@model$training_metrics@metrics$MSE` | `all`
`@model$valid_class_error` | `@model$validation_metrics@metrics$MSE` | `all`
`@model$varimp` | `@model$_variable_importances` | `all`
`@model$confusion` | `@model$training_metrics@metrics$cm$table` | `binomial` and `multinomial`
`@model$train_auc` | `@model$train_AUC` | `binomial`
&nbsp; | `@model$_validation_metrics` | `all`
&nbsp; | `@model$_model_summary` | `all`
&nbsp; | `@model$_scoring_history` | `all`

---

484 <a name="DRF"></a>
Jun 23, 2017 @angela0xdata PUBDEV-4213: Updates to markdown syntax (#1307)
485 ## Distributed Random Forest
May 16, 2015 @jessica0xdata Restore Porting R Scripts doc, add Migration Guide
486
Jun 23, 2017 @angela0xdata PUBDEV-4213: Updates to markdown syntax (#1307)
487 ### Changes to DRF in H2O 3.0

Distributed Random Forest (DRF) was represented as `h2o.randomForest(type="BigData", ...)` in H2O Classic. In H2O Classic, SpeeDRF (`type="fast"`) was not as accurate, especially for complex data with categoricals, and did not address regression problems. DRF (`type="BigData"`) was at least as accurate as SpeeDRF and was the only algorithm that scaled to big data (data too large to fit on a single node).
In H2O 3.0, our plan is to improve the performance of DRF when the data fits on a single node (optimally, for all cases), which will make SpeeDRF obsolete. Ultimately, the goal is to provide a single algorithm that offers the "best of both worlds" for all datasets and use cases.
Please note that H2O does not currently support specifying the number of trees when using `h2o.predict` for a DRF model.

**Note**: H2O 3.0 only supports DRF. SpeeDRF is no longer supported. The functionality of DRF in H2O 3.0 is similar to DRF functionality in H2O Classic.

### Renamed DRF Parameters

The following parameters have been renamed, but retain the same functionality:

H2O Classic Parameter Name | H2O 3.0 Parameter Name
-------------------|-----------------------
`data` | `training_frame`
`key` | `model_id`
`validation` | `validation_frame`
`sample.rate` | `sample_rate`
`ntree` | `ntrees`
`depth` | `max_depth`
`balance.classes` | `balance_classes`
`score.each.iteration` | `score_each_iteration`
`class.sampling.factors` | `class_sampling_factors`
`nodesize` | `min_rows`

### Deprecated DRF Parameters

The following parameters have been removed:

- `classification`: This is now automatically inferred from the response type. To achieve classification with a 0/1 response column, explicitly convert the response to a factor (`as.factor()`).
- `importance`: Variable importances are now computed automatically and displayed in the model output.
- `holdout.fraction`: Specifying the fraction of the training data to hold out for validation is no longer supported.
- `doGrpSplit`: The bit-set group splitting of categorical variables is now the default.
- `verbose`: Information about tree splits and extra statistics is now included automatically in stdout.
- `oobee`: The out-of-bag error estimate is now computed automatically (if no validation set is specified).
- `stat.type`: This parameter was used for SpeeDRF, which is no longer supported.
- `type`: This parameter was used for SpeeDRF, which is no longer supported.
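
The renamed and removed parameters listed above can be applied mechanically to an existing argument list. The following is a minimal sketch in base R; `port_drf_args()`, `drf_renamed`, and `drf_removed` are hypothetical names introduced for illustration and are not part of the h2o package. Arguments that appear in neither list are passed through unchanged.

```r
# Classic -> 3.0 renames, copied from the "Renamed DRF Parameters" table.
drf_renamed <- c(
  data                   = "training_frame",
  key                    = "model_id",
  validation             = "validation_frame",
  sample.rate            = "sample_rate",
  ntree                  = "ntrees",
  depth                  = "max_depth",
  balance.classes        = "balance_classes",
  score.each.iteration   = "score_each_iteration",
  class.sampling.factors = "class_sampling_factors",
  nodesize               = "min_rows"
)

# Arguments listed under "Deprecated DRF Parameters".
drf_removed <- c("classification", "importance", "holdout.fraction",
                 "doGrpSplit", "verbose", "oobee", "stat.type", "type")

# Hypothetical helper (not an h2o function): takes a named list of
# H2O Classic DRF arguments and returns the list with 3.0 names.
port_drf_args <- function(args) {
  dropped <- intersect(names(args), drf_removed)
  if (length(dropped) > 0) {
    message("Dropping removed arguments: ", paste(dropped, collapse = ", "))
  }
  # Remove arguments that no longer exist in H2O 3.0.
  args <- args[!(names(args) %in% drf_removed)]
  # Rename the remaining arguments that have a 3.0 equivalent.
  hit <- names(args) %in% names(drf_renamed)
  names(args)[hit] <- unname(drf_renamed[names(args)[hit]])
  args
}

# Illustrative Classic-style arguments; the values are placeholders.
classic_args <- list(x = 1:4, y = 5, data = "iris.hex", ntree = 50,
                     depth = 20, nodesize = 1, importance = TRUE,
                     type = "BigData")
str(port_drf_args(classic_args))
```

A helper like this only rewrites names. It does not check values, so a script that relied on `classification=TRUE` with a 0/1 response still needs the explicit `as.factor()` conversion described above.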

### New DRF Parameters

The following parameter has been added:

- `build_tree_one_node`: Run on a single node to use fewer CPUs.

### DRF Algorithm Comparison

H2O Classic | H2O 3.0
------------- | -------------
`h2o.randomForest <- function(` | `h2o.randomForest <- function(`
`x,` | `x,`
`y,` | `y,`
`data,` | `training_frame,`
`key="",` | `model_id,`
`validation,` | `validation_frame,`
`mtries = -1,` | `mtries = -1,`
`sample.rate=2/3,` | `sample_rate = 0.632,`
&nbsp; | `build_tree_one_node = FALSE,`
`ntree=50,` | `ntrees=50,`
`depth=20,` | `max_depth = 20,`
&nbsp; | `min_rows = 1,`
`nbins=20,` | `nbins = 20,`
`balance.classes = FALSE,` | `balance_classes = FALSE,`
`score.each.iteration = FALSE,` | `score_each_iteration = FALSE,`
`seed = -1,` | `seed,`
`nodesize = 1,` | &nbsp;
`classification=TRUE,` | &nbsp;
`importance=FALSE,` | &nbsp;
`nfolds=0,` | &nbsp;
`holdout.fraction = 0,` | &nbsp;
`max.after.balance.size = 5,` | `max_after_balance_size)`
`class.sampling.factors = NULL,` | &nbsp;
`doGrpSplit = TRUE,` | &nbsp;
`verbose = FALSE,` | &nbsp;
`oobee = TRUE,` | &nbsp;
`stat.type = "ENTROPY",` | &nbsp;
`type = "fast")` | &nbsp;
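
As a rough illustration of the comparison above, a ported DRF call might look like the sketch below. It assumes the h2o R package (3.x) is installed and a cluster is running. `train_drf_v3()` is a hypothetical wrapper used only for this example, and the model ID and seed are arbitrary placeholders. The call is wrapped in a function so that nothing is sent to the cluster until it is invoked.

```r
# Hypothetical wrapper (not part of h2o) showing a DRF call with
# H2O 3.0 argument names. `train` and `valid` are assumed to be H2OFrames.
train_drf_v3 <- function(train, x, y, valid = NULL) {
  h2o::h2o.randomForest(
    x = x,
    y = y,
    training_frame   = train,        # was `data`
    validation_frame = valid,        # was `validation`
    model_id         = "drf_ported", # was `key` (placeholder ID)
    ntrees           = 50,           # was `ntree`
    max_depth        = 20,           # was `depth`
    min_rows         = 1,            # was `nodesize`
    sample_rate      = 0.632,        # was `sample.rate`
    seed             = 1234          # arbitrary, for repeatability
  )
  # No `classification`, `importance`, `oobee`, or `type` arguments:
  # these were removed in H2O 3.0.
}

# Example usage (requires a running H2O cluster):
#   h2o::h2o.init()
#   hf <- h2o::as.h2o(iris)
#   model <- train_drf_v3(hf, x = 1:4, y = "Species")
```

Because `Species` is already a factor in `iris`, the response is treated as categorical and no explicit conversion is needed in this usage example.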

### Output

The following table provides the component name in H2O Classic, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [`h2o.performance`](#h2operf).

H2O Classic | H2O 3.0 | Model Type
------------- | ------------- | -------------
`@model$priorDistribution`| &nbsp; | `all`
`@model$params` | `@allparameters` | `all`
`@model$mse` | `@model$scoring_history` | `all`
`@model$forest` | `@model$model_summary` | `all`
`@model$classification` | &nbsp; | `all`
`@model$varimp` | `@model$variable_importances` | `all`
`@model$confusion` | `@model$training_metrics@metrics$cm$table` | `binomial` and `multinomial`
`@model$auc` | `@model$training_metrics@metrics$AUC` | `binomial`
`@model$gini` | `@model$training_metrics@metrics$Gini` | `binomial`
`@model$best_cutoff` | &nbsp; | `binomial`
`@model$F1` | `@model$training_metrics@metrics$thresholds_and_metric_scores$f1` | `binomial`
`@model$F2` | `@model$training_metrics@metrics$thresholds_and_metric_scores$f2` | `binomial`
`@model$accuracy` | `@model$training_metrics@metrics$thresholds_and_metric_scores$accuracy` | `binomial`
`@model$Error` | `@model$Error` | `binomial`
`@model$precision` | `@model$training_metrics@metrics$thresholds_and_metric_scores$precision` | `binomial`
`@model$recall` | `@model$training_metrics@metrics$thresholds_and_metric_scores$recall` | `binomial`
`@model$mcc` | `@model$training_metrics@metrics$thresholds_and_metric_scores$absolute_MCC` | `binomial`
`@model$max_per_class_err` | currently replaced by `@model$training_metrics@metrics$thresholds_and_metric_scores$min_per_class_correct` | `binomial`
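
The slot paths in the table can be collected in one place when porting code that reads DRF output. The sketch below assumes `model` is a binomial DRF model returned by `h2o.randomForest` in H2O 3.0. `drf_training_summary()` is a hypothetical helper, not part of the h2o package, and it only reads the slots named in the table.

```r
# Hypothetical helper (not an h2o function): gathers the H2O 3.0
# counterparts of several H2O Classic DRF output components.
drf_training_summary <- function(model) {
  metrics <- model@model$training_metrics@metrics
  list(
    auc                  = metrics$AUC,                      # was @model$auc
    gini                 = metrics$Gini,                     # was @model$gini
    confusion            = metrics$cm$table,                 # was @model$confusion
    variable_importances = model@model$variable_importances, # was @model$varimp
    scoring_history      = model@model$scoring_history,      # was @model$mse
    params               = model@allparameters               # was @model$params
  )
}

# Example usage (requires a trained binomial DRF model):
#   s <- drf_training_summary(model)
#   s$auc
```

Reading slots directly ties a script to the internal layout of the model object, so where an equivalent value is available through `h2o.performance`, that route is likely to be more stable across releases.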