PR - for FfDL Integration With H2O.ai #88

Merged: 17 commits, Jun 5, 2018

nkpng2k commented Jun 4, 2018

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

Hello Tommy and Animesh,
Dropping the PR for H2O + FfDL integration. Feel free to add any comments/changes necessary and I will make them happen.
Thanks,
Nicholas

Tomcli left a comment

Just added some small comments. I will do a full review after I run some tests with this PR.

Makefile Outdated
@@ -104,7 +103,7 @@ minikube: ## Configure Minikube (local Kubernetes)
@which minikube > /dev/null || (echo Please install Minikube; exit 1)
@minikube ip > /dev/null 2>&1 || ( \
echo "Starting up Minikube"; \
- minikube start --insecure-registry 9.0.0.0/8 --insecure-registry 10.0.0.0/8 --cpus $(MINIKUBE_CPUS) --memory $(MINIKUBE_RAM) --vm-driver=$(MINIKUBE_DRIVER) > /dev/null; \
+ minikube start --insecure-registry 9.0.0.0/8 --insecure-registry 10.0.0.0/8 --cpus $(MINIKUBE_CPUS) --memory $(MINIKUBE_RAM) > /dev/null; \

Why do we need to remove the --vm-driver flag?

version: "latest"
command: >
./h2o3-cluster.sh;
if [ "$LEARNER_ID" == "1" ]; then python h2o3_baseline.py --trainDataFile ${DATA_DIR}/higgs_train_10k.csv --target response --memory 1; fi;

In community/FfDL-H2Oai/README.md, can you add some instructions on how to obtain the higgs_train_10k.csv dataset?

Author reply:

Added; see the Examples section in the README.

nkpng2k commented Jun 4, 2018

@Tomcli

The Makefile changes were an accident; I didn't realize I had pushed those. I was experimenting. Good catch.

Also, I edited the README to include the S3 links to the Higgs dataset.

Tomcli left a comment

Thanks @nkpng2k, overall this PR looks very good. The only thing we need to change is the default resources for this example, because I was getting insufficient-memory errors when running with only 1Gb of memory per learner. After that we can pull in this PR. 👍

learners: 2
gpus: 0
cpus: 0.5
memory: 1Gb

After running a few tests with this example, I found it works better with 1 CPU and 2Gb of memory.
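
For reference, a minimal sketch of what the suggested defaults might look like, assuming the same four resource keys quoted in the snippet under review and the same unit spelling (`Gb`). The learner and GPU counts are carried over unchanged as an assumption; the values that were actually merged are not shown in this thread.

```yaml
# Sketch only: resource defaults following the reviewer's suggestion.
# Keys are copied from the snippet under review; values are illustrative.
learners: 2      # unchanged from the original snippet (assumption)
gpus: 0          # unchanged from the original snippet (assumption)
cpus: 1          # raised from 0.5 per the review comment
memory: 2Gb      # raised from 1Gb per the review comment
```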

Tomcli mentioned this pull request Jun 4, 2018

nkpng2k commented Jun 5, 2018

@Tomcli

Hi Tommy,
I made a minor change to the defaults in manifest-h2o.yml and also updated README.md with a suggestion for the memory allocation required to run H2O.

Tomcli commented Jun 5, 2018

Thanks @nkpng2k, looks very good to me. 👍

animeshsingh commented

LGTM

animeshsingh merged commit 788aa30 into IBM:master Jun 5, 2018