[SW-2567] Fix CoxPH docs example for Scala and Python #2531
Current example fails with:

```
org.apache.spark.sql.AnalysisException: cannot resolve '`label`' given input columns: [start, transplant, stop, age, event, year, id, surgery];;
'Project [start#2305, stop#2306, event#2307, age#2308, year#2309, surgery#2310, transplant#2311, id#2312, 'label]
+- Relation[start#2305,stop#2306,event#2307,age#2308,year#2309,surgery#2310,transplant#2311,id#2312] csv
```
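The exception above says that a column named `label` was requested but is not among the dataset's columns. A minimal pure-Python sketch of that check is shown here for illustration. The helper `missing_columns` is hypothetical and not part of Spark or Sparkling Water, and the "required" column lists are assumptions chosen to mirror the error message.

```python
def missing_columns(input_columns, required_columns):
    """Return the required columns that are absent from the input, in order."""
    available = set(input_columns)
    return [c for c in required_columns if c not in available]


# Input columns exactly as reported in the AnalysisException.
input_columns = ["start", "transplant", "stop", "age", "event", "year", "id", "surgery"]

# Assumption: the failing example ends up asking for a column called "label".
print(missing_columns(input_columns, ["start", "stop", "label"]))  # ['label']

# Assumption: pointing at the existing "event" column instead resolves cleanly.
print(missing_columns(input_columns, ["start", "stop", "event"]))  # []
```

Running a check like this against the CSV header before fitting would surface the mismatch earlier than the Spark analysis error does.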
Hi @neemah2o, your PR title "Create sw_coxph.rst" is missing a JIRA issue number! Please specify it in the following form: [SW-]
It seems we need a different dataset with test data; all predictions are NaN:
Testing dataset doesn't have |
Good point @mn-mikke, the H2O 3 version uses heart.csv and splits it.
I will modify the PR to match it.
Updated the frames with a split of heart.csv into train/test and corrected the Scala formatting so it works in the CLI and in Jupyter notebooks. Note that heart.csv in the SW smalldata folder is the same file used in the H2O 3 heart.csv example (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/coxph.html#examples).
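The split itself is not shown in this thread. As a rough, pure-Python stand-in for splitting one dataset into train and test parts, the sketch below shuffles rows with a fixed seed and cuts them at a fraction. The function name `split_rows`, the 80/20 ratio, the seed, and the sample rows are all assumptions for illustration; on a Spark DataFrame, `DataFrame.randomSplit` would be the usual tool.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    shuffled = list(rows)       # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up rows that only borrow a few column names from heart.csv.
rows = [{"id": i, "start": 0, "stop": 10 + i, "event": i % 2} for i in range(10)]

train, test = split_rows(rows)
print(len(train), len(test))  # 8 2
```

Because both parts come from the same file, the test part carries the same columns as the training part, which is the property the original test dataset lacked.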
@mn-mikke , I made the changes to give a proper test dataset:
LGTM. Thank you @neemah2o!
* Create sw_coxph.rst

  Current example fails with:

  ```
  org.apache.spark.sql.AnalysisException: cannot resolve '`label`' given input columns: [start, transplant, stop, age, event, year, id, surgery];;
  'Project [start#2305, stop#2306, event#2307, age#2308, year#2309, surgery#2310, transplant#2311, id#2312, 'label]
  +- Relation[start#2305,stop#2306,event#2307,age#2308,year#2309,surgery#2310,transplant#2311,id#2312] csv
  ```

* Corrected "Isolation Forest" to "CoxPH"

* Updated example with split of heart.csv & corrected scala formatting

  Updated the frames with a split of heart.csv into train/test and corrected the Scala formatting so it works in the CLI and in Jupyter notebooks. Note that heart.csv in the SW smalldata folder is the same file used in the H2O 3 heart.csv example (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/coxph.html#examples).

(cherry picked from commit 7b6ea69)
Fixes JIRA: https://h2oai.atlassian.net/browse/SW-2567
Need to add the parameters that CoxPH requires.