#Recent Changes
##H2O-Dev
###Serre (0.2.1.1) - 3/18/15
####New Features
The following features have been added since the last release:
#####Algorithms
- Naive Bayes in H2O-dev (PUBDEV-158)
- GLM model output, details from R (HEXDEV-94)
- Run GLM Regression from Flow (including LBFGS) (HEXDEV-110)
- PCA (PUBDEV-157)
- Port Random Forest to h2o-dev (PUBDEV-455)
- Enable DRF model output (github)
- Add DRF to Flow (Model Output) (PUBDEV-533)
- Grid for GBM (github)
- Run Deep Learning Regression from Flow (HEXDEV-109)
#####Python
- Add Python wrapper for DRF (PUBDEV-534)
#####R
- Add R wrapper for DRF (PUBDEV-530)
#####System
- Include uploadFile (PUBDEV-299) (github)
- Added -flow_dir to hadoop driver (github)
#####Web UI
- Add Flow packs (HEXDEV-190) (PUBDEV-247)
- Integrate H2O Help inside Help panel (PUBDEV-108) (github)
- Add quick toggle button to show/hide the sidebar (github)
- Add New, Open toolbar buttons (github)
- Auto-refresh data preview when parse setup input parameters are changed (PUBDEV-532)
- Flow: Add playbar with Run, Continue, Pause, Progress controls (HEXDEV-192)
- You can now stop/cancel a running flow
####Enhancements
The following changes are improvements to existing features (which includes changed default values):
#####Algorithms
- Display GLM coefficients only if available (PUBDEV-466)
- Add random chance line to RoC chart (HEXDEV-168)
- Allow validation dataset for AutoEncoder (PUBDEV-581)
- Speed up DLSpiral test. Ignore Neurons test (MatVec) (github)
- Use getRNG for Dropout (github)
- PUBDEV-598: Add tests for determinism of RNGs (github)
- PUBDEV-598: Implement Chi-Square test for RNGs (github)
- PUBDEV-580: Add log loss to binomial and multinomial model metric (github)
- Add DL model output toString() (github)
- Add LogLoss to MultiNomial ModelMetrics (PUBDEV-580)
- Port MissingValueInserter EndPoint to h2o-dev (PUBDEV-465)
- Print number of categorical levels once we hit >1000 input neurons. (github)
- Updated the loss behavior for GBM. When loss is set to AUTO, if the response is an integer with 2 levels, then bernoulli (rather than gaussian) behavior is chosen. As a result, the `do_classification` flag is no longer necessary in Flow, since the loss completely specifies the desired behavior, and R users no longer need to use `as.factor()` in their response to get the desired bernoulli behavior. The `score_each_iteration` flag has been removed as well. (github)
- Fully remove `_convert_to_enum` in all algos (github)
- Add DL POJO scoring (PUBDEV-585)
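The AUTO loss rule for GBM noted in this list can be illustrated with a small sketch. This is not H2O's implementation: the helper name `resolve_gbm_loss` and its arguments are hypothetical, and falling back to gaussian for every other response is an assumption made only to keep the example complete.

```python
def resolve_gbm_loss(loss, response):
    """Hypothetical helper: pick the effective loss for a response column."""
    if loss != "AUTO":
        # An explicitly chosen loss fully specifies the behavior.
        return loss
    # AUTO: an all-integer response with exactly 2 distinct levels -> bernoulli.
    is_integer = all(float(v).is_integer() for v in response)
    if is_integer and len(set(response)) == 2:
        return "bernoulli"
    # Assumption for this sketch: anything else is treated as gaussian.
    return "gaussian"


print(resolve_gbm_loss("AUTO", [0, 1, 1, 0]))
print(resolve_gbm_loss("AUTO", [0.5, 1.2, 3.0]))
```

Under this rule a 0/1 integer response no longer needs to be converted to a factor first to get bernoulli behavior.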
#####API
- Display point layer for tree vs mse plots in GBM output (PUBDEV-504)
- Rename API inputs/outputs (github)
- Rename Inf to Infinity (github)
#####Python
- added H2OFrame.setNames(), H2OFrame.cbind(), H2OVec.cbind(), h2o.cbind(), and pyunit_cbind.py (github)
- Make H2OVec.levels() return the levels (github)
- H2OFrame.dim(), H2OFrame.append(), H2OVec.setName(), H2OVec.isna() additions. demo pyunit addition (github)
#####R
- PUBDEV-578, PUBDEV-541, PUBDEV-566 (github):
  - R client now sends the data frame column names and data types to ParseSetup.
  - R client can get column names from a parsed frame or a list.
  - Respects client request for column data types.
#####System
- Customize H2O web UI port (PUBDEV-483)
- Make parse setup interactive (PUBDEV-532)
- Added --verbose (github)
- Adds some H2OParseExceptions. Removes all H2O.fail in parse (no parse issues should cause a fail)(github)
- Allows parse to specify check_headers=HAS_HEADERS, but not provide column names (github)
- Port MissingValueInserter EndPoint to h2o-dev (PUBDEV-465)
#####Web UI
- Add 'Clear cell' and 'Run all cells' toolbar buttons (github)
- Add 'Clear cell' and 'Clear all cells' commands (PUBDEV-493) (github)
- 'Run' button selects next cell after running
- ModelMetrics by model category: Clustering (PUBDEV-416)
- ModelMetrics by model category: Regression (PUBDEV-415)
- ModelMetrics by model category: Multinomial (PUBDEV-414)
- ModelMetrics by model category: Binomial (PUBDEV-413)
- Add ability to select and delete multiple models (github)
- Add ability to select and delete multiple frames (github)
- Flows now stop running when an error occurs
- Print full number of mismatches during POJO comparison check. (github)
- Make Grid multi-node safe (github)
- Beautify the vertical axis labels for Flow charts/visualization (more) (PUBDEV-329)
####Bug Fixes
The following changes are to resolve incorrect software behavior:
#####Algorithms
- GBM only populates either MSE_train or MSE_valid but displays both (PUBDEV-350)
- GBM: train error increases after hitting zero on prostate dataset (PUBDEV-513)
- GBM : Variable importance displays 0's for response param => should not display response in table at all (PUBDEV-430)
- Inconsistency in GBM results: Gives different results even when run with the same set of params (HEXDEV-194)
- GLM : R/Flow ==> Build GLM Model hangs at 4% (PUBDEV-456)
- Import file from R hangs at 75% for 15M Rows/2.2 K Columns (HEXDEV-179)
- Flow: GLM - 'model.output.coefficients_magnitude.name' not found, so can't view model (PUBDEV-466)
- GBM predict fails without response column (PUBDEV-478)
- GBM: When validation set is provided, gbm should report both mse_valid and mse_train (PUBDEV-499)
- PCA Assertion Error during Model Metrics (PUBDEV-548) (github)
- KMeans: Size of clusters in Model Output is different from the labels generated on the training set (PUBDEV-542) (github)
- divide by zero in modelmetrics for deep learning (PUBDEV-568)
- AUC reported on training data is 0, but should be 1 (HEXDEV-223) (github)
- GBM: reports 0th tree mse value for the validation set, different than the train set, when only the train set is provided (PUBDEV-561)
- PUBDEV-580: Fix some numerical edge cases (github)
- Fix two missing float -> double conversion changes in tree scoring. (github)
- Problems during Train/Test adaptation between Enum/Numeric (HEXDEV-229)
- DRF/GBM balance_classes=True throws unimplemented exception (HEXDEV-226)
- Flow: HIDDEN_DROPOUT_RATIOS for DL does not show default value (PUBDEV-285)
- Old GLM Parameters Missing (PUBDEV-431)
- GBM: Initial mse in bernoulli seems to be off (PUBDEV-515)
#####API
- SplitFrame on String column produce C0LChunk instead of CStrChunk (PUBDEV-468)
- Error in node$h2o$node : $ operator is invalid for atomic vectors (PUBDEV-348)
- Response from /ModelBuilders don't conform to standard error json shape when there are errors (HEXDEV-121)
#####Python
- fix python syntax error (github)
- Fixes handling of None in python for a returned na_string. (github)
#####R
- R : Inconsistency - Train set name with and without quotes work but Validation set name with quotes does not work (PUBDEV-491)
- h2o.confusionmatrices does not work (PUBDEV-547)
- How do i convert an enum column back to integer/double from R? (PUBDEV-546)
- Summary in R is faulty (PUBDEV-539)
- Custom Functions don't work in apply() in R (PUBDEV-436)
- R: as.h2o should preserve R data types (PUBDEV-578)
- as.h2o loses track of headers (PUBDEV-541)
- NPE in GBM Prediction with Sliced Test Data (HEXDEV-207) (github)
- Import file from R hangs at 75% for 15M Rows/2.2 K Columns (HEXDEV-179)
- got water.DException$DistributedException and then got java.lang.RuntimeException: Categorical renumber task (HEXDEV-195)
- h2o.confusionMatrices for multinomial does not work (PUBDEV-577)
- R: h2o.confusionMatrix should handle both models and model metric objects (PUBDEV-590)
- H2O-R: as.h2o parses column name as one of the row entries (PUBDEV-591)
#####System
- Flow: When balance class = F then flow should not show max_after_balance_size = 5 in the parameter listing (PUBDEV-503)
- 3 jvms, doing ModelMetrics on prostate, class water.KeySnapshot$GlobalUKeySetTask; class java.lang.AssertionError: *** Attempting to block on task (class water.TaskGetKey) with equal or lower priority. Can lead to deadlock! 122 <= 122 (PUBDEV-495)
- Not able to start h2o on hadoop (PUBDEV-487)
- one row (one col) dataset seems to get assertion error in parse setup request (PUBDEV-96)
- Parse : Import file (move.com) => Parse => First row contains column names => column names not selected (HEXDEV-171) (github)
- The NY0 parse rule, in summary. Doesn't look like it's counting the 0's as NAs like h2o (PUBDEV-154)
- 0 / Y / N parsing (PUBDEV-229)
- NodePersistentStorage gets wiped out when laptop is restarted. (HEXDEV-167)
- Parse : Parsing random crap gives java.lang.ArrayIndexOutOfBoundsException: 13 (PUBDEV-428)
- Flow: converting a column to enum while parsing does not work (PUBDEV-566)
- Parse: Numbers completely parsed wrong (PUBDEV-574)
- NodePersistentStorage gets wiped out when hadoop cluster is restarted (HEXDEV-185)
- Parse: Fail gracefully when asked to parse a zip file with different files in it (PUBDEV-540)(github)
- Building a model and making a prediction accepts invalid frame types (PUBDEV-83)
- Flow : Import file 15M rows 2.2 Cols => Parse => Error fetching job on UI =>Console : ERROR: Job was not successful Exiting with nonzero exit status (HEXDEV-55)
- Flow : Build GLM Model => Family tweedy => class hex.glm.LSMSolver$ADMMSolver$NonSPDMatrixException', with msg 'Matrix is not SPD, can't solve without regularization (PUBDEV-211)
- Flow : Import File : File doesn't exist on all the hdfs nodes => Fails without valid message (PUBDEV-313)
- Check reproducibility on multi-node vs single-node (PUBDEV-557)
- Parse: After parsing Chicago crime dataset => Not able to build models or Get frames (PUBDEV-576)
#####Web UI
- Flow : Build Model => Parameters => shows meta text for some params (PUBDEV-505)
- Flow: K-Means - "None" option should not appear in "Init" parameters (PUBDEV-459)
- Flow: PCA - "None" option appears twice in "Transform" list (HEXDEV-186)
- GBM Model : Params in flow show two times (PUBDEV-440)
- Flow multinomial confusion matrix visualization (HEXDEV-204)
- Flow: It would be good if flow can report the actual distribution, instead of just reporting "Auto" in the model parameter listing (PUBDEV-509)
- Unimplemented algos should be taken out from drop down of build model (PUBDEV-511)
- [MapR] unable to give hdfs file name from Flow (PUBDEV-409)
###Selberg (0.2.0.1) - 3/6/15
####New Features
#####Web UI
- Flow: Delete functionality to be available for import files, jobs, models, frames (PUBDEV-241)
- Implement "Download Flow" (PUBDEV-407)
- Flow: Implement "Run All Cells" (PUBDEV-110)
#####API
- Create python package (PUBDEV-181)
- as.h2o in Python (HEXDEV-72)
#####System
####Enhancements
#####Web UI
- Flow: Job view should have info on start and end time (PUBDEV-267)
- Flow: Implement 'File > Open' (PUBDEV-408)
- Display IP address in ADMIN -> Cluster Status (HEXDEV-159)
- Flow: Display alternate UI for splitFrames() (PUBDEV-399)
#####Algorithms
- Added K-Means scoring (github)
- Flow: Implement model output for Deep Learning (PUBDEV-118)
- Flow: Implement model output for GLM (PUBDEV-120)
- Deep Learning model output (HEXDEV-89, Flow),(HEXDEV-88, Python),(HEXDEV-87, R)
- Run GLM Binomial from Flow (including LBFGS) (HEXDEV-90)
- Flow: Display confusion matrices for multinomial models (PUBDEV-397)
- During PCA, missing values in training data will be replaced with column mean (github)
- Update parameters for best model scan (github)
- Change Quantiles to match h2o-1; both Quantiles and Rollups now have the same default percentiles (github)
- Massive cleanup and removal of old PCA, replacing with quadratically regularized PCA based on alternating minimization algorithm in GLRM (github)
- Add model run time to DL Model Output (github)
- Don't gather Neurons/Weights/Biases statistics (github)
- Only store best model if `override_with_best_model` is enabled (github)
- `beta_eps` added, passing tests changed (github)
- For GLM, default values for `max_iters` parameter were changed from 1000 to 50.
- For quantiles, probabilities are displayed.
- Run Deep Learning Multinomial from Flow (HEXDEV-108)
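The PCA change in this list (missing values in the training data are replaced with the column mean) can be illustrated with a minimal sketch. It is plain Python, not H2O code: the function name `impute_column_means` and the row-of-lists data layout are assumptions for illustration.

```python
import math


def impute_column_means(rows):
    """Replace NaN entries with the mean of the non-missing values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if not math.isnan(r[j])]
        # A column with no observed values has no mean; keep it as NaN.
        means.append(sum(present) / len(present) if present else float("nan"))
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)] for r in rows]


nan = float("nan")
filled = impute_column_means([[1.0, nan], [3.0, 4.0], [nan, 8.0]])
print(filled)
```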
#####API
- Expose DL weights/biases to clients via REST call (PUBDEV-344)
- Flow: Implement notification bar/API (PUBDEV-359)
- Variable importance data in REST output for GLM (PUBDEV-359)
- Add extra DL parameters to R API (`average_activation`, `sparsity_beta`, `max_categorical_features`, `reproducible`) (github)
- Update GLRM API model output (github)
- h2o.anomaly missing in R (PUBDEV-434)
- No method to get enum levels (PUBDEV-432)
#####System
- Improve memory footprint with latest version of h2o-dev (github)
- For now, let model.delete() of DL delete its best models too. This allows R code to not leak when only calling h2o.rm() on the main model. (github)
- Bind both TCP and UDP ports before clustering (github)
- Round summary row#. Helps with pctiles for very small row counts. Add a test to check for getting close to the 50% percentile on small rows. (github)
- Increase Max Value size in DKV to 256MB (github)
- Flow: make parseRaw() do both import and parse in sequence (HEXDEV-184)
- Remove notion of individual job/job tracking from Flow (PUBDEV-449)
- Capability to name prediction results Frame in flow (PUBDEV-233)
####Bug Fixes
#####Algorithms
- GLM binomial prediction failing (PUBDEV-403)
- DL: Predict with auto encoder enabled gives Error processing error (PUBDEV-433)
- balance_classes in Deep Learning intermittent poor result (PUBDEV-437)
- Flow: Building GLM model fails (PUBDEV-186)
- summary returning incorrect 0.5 quantile for 5 row dataset (PUBDEV-95)
- GBM missing variable importance and balance-classes (PUBDEV-309)
- H2O Dev GBM first tree differs from H2O 1 (PUBDEV-421)
- get glm model from flow fails to find coefficient name field (PUBDEV-394)
- GBM/GLM build model fails on Hadoop after building 100% => Failed to find schema for version: 3 and type: GBMModel (PUBDEV-378)
- Parsing KDD wrong (PUBDEV-393)
- GLM AIOOBE (PUBDEV-199)
- Flow : Build GLM Model with family poisson => java.lang.ArrayIndexOutOfBoundsException: 1 at hex.glm.GLM$GLMLambdaTask.needLineSearch(GLM.java:359) (PUBDEV-210)
- Flow : GLM Model Error => Enum conversion only works on small integers (PUBDEV-365)
- GLM binary response, do_classfication=FALSE, family=binomial, prediction error (PUBDEV-339)
- Epsilon missing from GLM parameters (PUBDEV-354)
- GLM NPE (PUBDEV-395)
- Flow: GLM bug (or incorrect output) (PUBDEV-252)
- GLM binomial on benign.csv gets assertion error in predict (PUBDEV-132)
- current summary default_pctiles doesn't have 0.001 and 0.999 like h2o1 (PUBDEV-94)
- Flow: Build GBM/DL Model: java.lang.IllegalArgumentException: Enum conversion only works on integer columns (PUBDEV-213) (github)
- ModelMetrics on cup98VAL_z dataset has response with many nulls (PUBDEV-214)
- GBM : Predict model category output/inspect parameters shows as Regression when model is built with do classification enabled (PUBDEV-441)
- Fix double-precision DRF bugs (github)
#####System
- Null columnTypes for /smalldata/arcene/arcene_train.data (PUBDEV-406) (github)
- Flow: Waiting for -1 responses after starting h2o on hadoop cluster of 5 nodes (PUBDEV-419)
- Parse: airlines_all.csv => Airtime type shows as ENUM instead of Integer (PUBDEV-426) (github)
- Flow: Typo - "Time" option displays twice in column header type menu in Parse (PUBDEV-446)
- Duplicate validation messages in k-means output (PUBDEV-305) (github)
- Fixes Parse so that it returns to supplying generic column names when no column names exist (github)
- Flow: Import File: File doesn't exist on all the hdfs nodes => Fails without valid message (PUBDEV-313)
- Flow: Parse => 1m.svm hangs at 42% (HEXDEV-174)
- Prediction NFE (PUBDEV-308)
- NPE doing Frame to key before it's fully parsed (PUBDEV-79)
- h2o_master_DEV_gradle_build_J8#351 hangs for past 17 hrs (PUBDEV-239)
- Sparkling water - container exited due to unavailable port (PUBDEV-357)
#####API
- Flow: Splitframe => java.lang.ArrayIndexOutOfBoundsException (PUBDEV-410) (github)
- Incorrect dest.type, description in /CreateFrame jobs (PUBDEV-404)
- space in windows filename on python (PUBDEV-444)
- Python end-to-end data science example 1 runs correctly (PUBDEV-182)
- 3/NodePersistentStorage.json/foo/id should throw 404 instead of 500 for 'not-found' (HEXDEV-163)
- POST /3/NodePersistentStorage.json should handle Content-Type:multipart/form-data (HEXDEV-165)
- by class water.KeySnapshot$GlobalUKeySetTask; class java.lang.AssertionError: *** Attempting to block on task (class water.TaskGetKey) with equal or lower priority. Can lead to deadlock! 122 <= 122 (PUBDEV-92)
- Sparkling water : val train:DataFrame = prostateRDD => Fails with ArrayIndexOutOfBoundsException (PUBDEV-392)
- Flow : getModels produces error: Error calling GET /3/Models.json (PUBDEV-254)
- ddply 'Could not find the operator' (HEXDEV-162) (github)
- h2o.table AIOOBE during NewChunk creation (HEXDEV-161) (github)
- Fix warning in h2o.ddply when supplying multiple grouping columns (github)
###0.1.26.1051 - 2/13/15
####New Features
- Flow: Display alternate UI for splitFrames() (PUBDEV-399)
####Enhancements
#####System
- Embedded H2O config can now provide flat file (needed for Hadoop) (github)
- Don't log GET of individual jobs to avoid filling up the logs (github)
#####Algorithms
- Increase GBM/DRF factor binning back to historical levels. Had been capped accidentally at nbins (typically 20), was intended to support a much higher cap. (github)
- Tweaked rho heuristic in glm (github)
- Enable variable importances for autoencoders (github)
- Removed `group_split` option from GBM
- Flow: display varimp for GBM output (PUBDEV-398)
- variable importance for GBM (github)
- GLM in H2O-Dev may provide slightly different coefficient values when applying an L1 penalty in comparison with H2O1.
####Bug Fixes
#####Algorithms
- Fixed bug in GLM exception handling causing GLM jobs to hang (github)
- Fixed a bug in kmeans input parameter schema where init was always being set to Furthest (github)
- Fixed mean computation in GLM (github)
- Fixed kmeans.R (github)
- Flow: Building GBM model fails with Error executing javascript (PUBDEV-396)
#####System
###0.1.26.1032 - 2/6/15
####New Features
#####General Improvements
- better model output
- support for Python client
- support for Maven
- support for Sparkling Water
- support for REST API schema
- support for Hadoop CDH5 (github)
#####UI
- Display summary visualizations by default in column summary output cells (PUBDEV-337)
- Display AUC curve by default in binomial prediction output cells (PUBDEV-338)
- Flow: Implement About H2O/Flow with version information (PUBDEV-111)
- Add UI for CreateFrame (PUBDEV-218)
- Flow: Add ability to cancel running jobs (PUBDEV-373)
- Flow: warn when user navigates away while having unsaved content (PUBDEV-322)
#####Algorithms
- Implement splitFrame() in Flow (PUBDEV-356)
- Variable importance graph in Flow for GLM (PUBDEV-360)
- Flow: Implement model building form init and validation (PUBDEV-102)
- Added a shuffle-and-split-frame function; Use it to build a saner model on time-series data (github)
- Added binomial model metrics (github)
- Run KMeans from R (HEXDEV-105)
- Be able to create a new GLM model from an existing one with updated coefficients (HEXDEV-48)
- Run KMeans from Python (HEXDEV-106)
- Run Deep Learning Binomial from Flow (HEXDEV-83)
- Run KMeans from Flow (HEXDEV-104)
- Run Deep Learning from Python (HEXDEV-85)
- Run Deep Learning from R (HEXDEV-84)
- Run Deep Learning Multinomial from Flow (HEXDEV-108)
- Run Deep Learning Regression from Flow (HEXDEV-109)
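The shuffle-and-split-frame function mentioned in this list is not described further here. As a rough sketch of the general idea only, the following shuffles rows with a seeded generator and then cuts them into two parts. The name `shuffle_and_split`, the `ratio` and `seed` parameters, and their defaults are all hypothetical and do not reflect the H2O function's signature.

```python
import random


def shuffle_and_split(rows, ratio=0.75, seed=42):
    """Shuffle a copy of the rows, then split it into two parts at the given ratio."""
    rng = random.Random(seed)  # seeded so the split is repeatable
    shuffled = list(rows)      # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


train, test = shuffle_and_split(list(range(100)))
print(len(train), len(test))
```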
#####API
- Flow: added REST API documentation to the web ui (PUBDEV-60)
- Flow: Implement visualization API (PUBDEV-114)
#####System
- Dataset inspection from Flow (HEXDEV-66)
- Basic data munging (Rapids) from R (HEXDEV-70)
- Implement stack operator/stacking in Lightning (HEXDEV-128)
####Enhancements
#####UI
- Added better message when h2o.init() not yet called (`No active connection to an H2O cluster. Try calling "h2o.init()"`) (github)
#####Algorithms
- Updated column-based gradient task to use sparse interface (github)
- Updated LBFGS (added progress monitor interface, updated some default params), added progress and job support to GLM lbfgs (github)
- Added pretty print (github)
- Added AutoEncoder to R model categories (github)
- Added Coefficients table to GLM model (github)
- Updated glm lbfgs to allow for efficient lambda-search (l2 penalty only) (github)
- Removed splitframe shuffle parameter (github)
- Simplified model builders and added deeplearning model builder (github)
- Add DL model outputs to Flow (PUBDEV-372)
- Flow: Deep Learning: Expert Mode (PUBDEV-284)
- Flow: Display multinomial and regression DL model outputs (PUBDEV-383)
- Display varimp details for DL models (PUBDEV-381)
- Make binomial response "0" and "1" by default (github)
- Update R GBM demos to reflect new input parameter names (github)
- Rename GLM variable importance to normalized coefficient magnitudes (github)
#####API
- Changed `key` to `destination_key` (github)
- Cleaned up REST API schema interface (github)
- Changed method name, cleaned setup, added a pyunit runner (github)
#####System
- Allow changing column types during parse-setup (PUBDEV-376)
- Display %NAs in model builder column lists (PUBDEV-375)
- Figure out how to add H2O to PyPI (PUBDEV-178)
####Bug Fixes
#####UI
- Flow: Parse => 1m.svm hangs at 42% (PUBDEV-345)
- cup98 Dataset has columns that prevent validation/prediction (PUBDEV-349)
- Flow: predict step failed to function (PUBDEV-217)
- Flow: Arrays of numbers (ex. hidden in deeplearning)require brackets (PUBDEV-303)
- Flow v.0.1.26.1030: StackTrace was broken (PUBDEV-371)
- Flow: Import files -> Search -> Parse these files -> null pointer exception (PUBDEV-170)
- Flow: "getJobs" not working (PUBDEV-320)
- Thresholds x Metrics and Max Criteria x Metrics tables were flipped in flow (HEXDEV-155)
- Flow v.0.1.26.1030: StackTrace is broken (PUBDEV-348)
- flow: getJobs always shows "Your H2O cloud has no jobs" (PUBDEV-243)
- Flow: First and last characters deleted from ignored columns (PUBDEV-300)
- Sparkling water => Flow => Menu buttons for cell do not show up (PUBDEV-294)
#####Algorithms
- Flow: Build K Means model with default K value gives error "Required field k not specified" (PUBDEV-167)
- Slicing out a specific data point is broken (PUBDEV-280)
- Flow: SplitFrame and grep in algorithms for flow and loops back onto itself (PUBDEV-272)
- Fixed the predict method (github)
- Refactor ModelMetrics into a different class for Binomial (github)
- /Predictions.json did not cache predictions (HEXDEV-119)
- Flow, DL: Error after changing hidden layer size (PUBDEV-323)
- Error in node$h2o$node: $ operator is invalid for atomic vectors (PUBDEV-348)
- Fixed K-means predict (PUBDEV-321)
- Flow: DL build mode fails => as it's missing adding quotes to parameter (PUBDEV-301)
- Flow: Build K means model with training/validation frames => unknown error (PUBDEV-185)
- Flow: Build quantile mode=> Click goes in loop (PUBDEV-188)
#####API
- Sparkling Water/Flow: Failed to find version for schema (PUBDEV-367)
- Cloud.json returns odd node name (PUBDEV-259)
#####System
- guesser needs to send types to parse (PUBDEV-279)
- Got h2o.clusterStatus function working in R. (github)
- Parse: Using R => java.lang.NullPointerException (PUBDEV-380)
- Flow: Jobs => click on destination key => unimplemented: Unexpected val class for Inspect: class water.fvec.DataFrame (PUBDEV-363)
- Column assignment in R exposes NullPointerException in Rollup (PUBDEV-155)
- import from hdfs doesn't add files (PUBDEV-260)
- AssertionError: ERROR: got tcp resend with existing in-progress task (PUBDEV-219)
- HDFS parse fails when H2O launched on Spark CDH5 (PUBDEV-138)
- Flow: Parse failure => java.lang.ArrayIndexOutOfBoundsException (PUBDEV-296)
- "predict" step is not working in flow (PUBDEV-202)
- Flow: Frame finishes parsing but comes up as null in flow (PUBDEV-270)
- scala >flightsToORD.first() fails with "not serializable result" (PUBDEV-304)
- DL throws NPE for bad column names (PUBDEV-15)
- Flow: Build model: Not able to build KMeans/Deep Learning model (PUBDEV-297)
- Flow: Col summary for NA/Y cols breaks (PUBDEV-325)
- Sparkling Water : util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread NanoHTTPD Session,9,main (PUBDEV-346)
- toDataFrame doesn't support sequence format schema (array, vectorUDT) (PUBDEV-457)
###0.1.20.1019 - 1/19/15
####New Features
#####UI
- Added various documentation links to the build page (github)
#####Algorithms
- Ported matrix multiply over and connected it to rapids (github)
####Enhancements
#####UI
- Allow user to specify (the log of) the number of rows per chunk for a new constant chunk; use this new function in CreateFrame (github)
- Make CreateFrame non-blocking, now displays progress bar in Flow (github)
- Add row and column count to H2OFrame show method (github)
- Admin watermeter page (PUBDEV-234)
- Admin stack trace (PUBDEV-228)
- Admin profile (PUBDEV-227)
- Flow: Add download logs in UI (PUBDEV-204)
- Need shutdown, minimally like h2o (PUBDEV-74)
#####API
- Changed 2 to 3 for JSON requests (github)
- Rename some more fields for consistency (`max_iters` changed to `max_iterations`, `_iters` to `_iterations`, `_ncats` to `_categorical_column_count`, `_centersraw` to `centers_raw`, `_avgwithinss` to `avg_within_ss`, `_withinmse` to `within_mse`) (github)
- Changed K-Means output parameters (`withinmse` to `within_mse`, `avgss` to `avg_ss`, `avgbetweenss` to `avg_between_ss`) (github)
- Remove default field values from DeepLearning parameters schema, since they come from the backing class (github)
- Add @API help annotation strings to JSON model output (PUBDEV-216)
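Client code that still uses the old field names could be adapted with a simple key mapping. The sketch below is illustrative only: the old and new names are as read from the two rename entries in this list, while the helper `rename_fields` and the flat-dictionary payload are assumptions, not part of any H2O client.

```python
# Old -> new names, as read from the rename entries in this list.
FIELD_RENAMES = {
    "max_iters": "max_iterations",
    "_iters": "_iterations",
    "_ncats": "_categorical_column_count",
    "_centersraw": "centers_raw",
    "_avgwithinss": "avg_within_ss",
    "_withinmse": "within_mse",
    "withinmse": "within_mse",
    "avgss": "avg_ss",
    "avgbetweenss": "avg_between_ss",
}


def rename_fields(payload):
    """Return a copy of a flat dict with old field names replaced by the new ones."""
    return {FIELD_RENAMES.get(key, key): value for key, value in payload.items()}


print(rename_fields({"max_iters": 50, "avgss": 1.5, "k": 3}))
```

Keys that are not in the mapping pass through unchanged. Note that `_withinmse` and `withinmse` both map to `within_mse`, so a payload containing both would keep only one value.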
#####Algorithms
- Minor fix in rapids matrix multiplication (github)
- Updated sparse chunk to cut off binary search for prefix/suffix zeros (github)
- Updated L_BFGS for GLM - warm-start solutions during lambda search, correctly pass current lambda value, added column-based gradient task (github)
- Fix model parameters' default values in the metadata (github)
- Set default value of k = number of clusters to 1 for K-Means (PUBDEV-251)
#####System
- Reject any training data with non-numeric values from KMeans model building (github)
####Bug Fixes
#####API
- Fixed isSparse call for constant chunks (github)
- Fixed sparse interface of constant chunks (no nonzero if const 1= 0) (github)
#####System
- Typeahead for folder contents apparently requires trailing "/" (github)
- Fix build and instructions for R install.packages() style of installation; Note we only support source installs now (github)
- Fixed R test runner h2o package install issue that caused it to fail to install on dev builds (github)
###0.1.18.1013 - 1/14/15
####New Features
#####UI
- Admin timeline (PUBDEV-226)
- Admin cluster status (PUBDEV-225)
- Markdown cells should auto run when loading a saved Flow notebook (PUBDEV-87)
- Complete About page to include info about the H2O version (PUBDEV-223)
####Enhancements
#####Algorithms
- Flow: Implement model output for GBM (PUBDEV-119)
###0.1.20.1016 - 12/28/14
- Added ip_port field in node json output for Cloud query (github)