Sample does not work for Mesos Cluster #23

Closed
yanglei99 opened this issue Dec 3, 2015 · 6 comments

Comments

@yanglei99

I can run the sample using spark-submit with --master=local[2]. However, when I target my Mesos cluster, I get an NPE:
by class water.parser.ParseSetup$GuessSetupTsk; class java.lang.NullPointerException: null
at water.parser.ParseSetup$GuessSetupTsk.map(ParseSetup.java:269)
at water.MRTask.compute2(MRTask.java:624)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1017)

If the issue is related to how the sample loads the data, I wonder whether we should create a sample that works with remote clusters, e.g. by hosting the data on S3.

Thank you.

Yang.

@mmalohlava
Member

Hi Yang,

If you would like to point to a data store on HDFS/S3, please pass a URI that points to it:

val hf = new H2OFrame(new java.net.URI("hdfs://mynamenode/mydirectory/myfile.csv"))

The same for S3.
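
A small sanity check can catch a malformed location before it reaches the parser. The sketch below uses only java.net.URI from the JDK; the helper name requireScheme is hypothetical and not part of Sparkling Water. It relies on the fact that a string without "://" parses as a relative path with no scheme, so the check rejects it. Which S3 scheme applies (for example s3n or s3a) depends on the Hadoop setup and is an assumption to verify for your cluster.

```scala
import java.net.URI

object UriCheck {
  // Hypothetical helper (not a Sparkling Water API): parse a location and
  // fail early if it does not carry one of the expected schemes.
  def requireScheme(location: String, allowed: Set[String]): URI = {
    val uri = new URI(location)
    // getScheme returns null for relative URIs such as "hdfs//host/file.csv"
    val scheme = Option(uri.getScheme).getOrElse("")
    require(allowed.contains(scheme), s"Unsupported or missing scheme in '$location'")
    uri
  }

  def main(args: Array[String]): Unit = {
    val uri = requireScheme("hdfs://mynamenode/mydirectory/myfile.csv", Set("hdfs"))
    println(s"scheme=${uri.getScheme}, host=${uri.getHost}, path=${uri.getPath}")
  }
}
```

The returned URI could then be passed to the H2OFrame constructor shown above.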

If you would like to parse a local file, it has to be distributed to each node in the cluster:
val hf = new H2OFrame(new java.io.File("/my/cluster/datastore/file.csv"))

Right now, our API does not provide a shortcut for uploading a local file (open issue:
https://0xdata.atlassian.net/browse/SW-56).

Thank you
Michal

@yanglei99
Author

Thanks Michal,

So basically we are saying that the samples only work in local mode. That is why I asked whether we should host the sample data somewhere like S3, so that the samples can run out of the box.

I will close the issue now. Thank you.

Yang

@yanglei99 yanglei99 reopened this Dec 4, 2015
@yanglei99
Author

Another thought: the sample could read "SPARKLING_WATER_HOME" to construct the full path to the file. Then, as long as the target slaves also have Sparkling Water installed, it would be able to load the file.

Thanks. Yang.
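
The idea above can be sketched as follows, assuming Sparkling Water is installed at the same location on every node. The object name SampleDataPath, the resolve helper, and the relative path examples/smalldata/sample.csv are all hypothetical placeholders, not names from the project.

```scala
import java.io.File

object SampleDataPath {
  // Build a file path under SPARKLING_WATER_HOME, or None if the variable is unset.
  // The env parameter defaults to the process environment and exists so the
  // lookup can be exercised without changing real environment variables.
  def resolve(relative: String, env: Map[String, String] = sys.env): Option[File] =
    env.get("SPARKLING_WATER_HOME").map(home => new File(home, relative))

  def main(args: Array[String]): Unit = {
    // Placeholder relative path, chosen only for illustration.
    resolve("examples/smalldata/sample.csv") match {
      case Some(file) => println(s"Would load ${file.getPath} (exists=${file.exists})")
      case None       => println("SPARKLING_WATER_HOME is not set")
    }
  }
}
```

The resulting File would only be usable for parsing if the same path exists on each node in the cluster.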

@yanglei99
Author

Verified that the sample works after changing the file location to a downloadable one.

@crystalfuns

How do I connect Sparkling Water to a DC/OS Mesos Spark cluster?

@mmalohlava
Member

Right now, we do not provide any explicit support for DC/OS. However, any feedback, recommendations, or requirements are welcome.
