Checkpoint to s3 not working #175

Closed
mavencode01 opened this Issue Feb 24, 2017 · 4 comments


mavencode01 commented Feb 24, 2017

I'm trying to deploy my job to AWS EMR using S3 as my checkpoint directory, but I'm getting the following exception:

java.lang.IllegalArgumentException: Wrong FS: s3://spark-jobs/supercluster-checkpoint/f9d90be3-bb3e-4a7f-8009-37bb4d697f67/connected-components-e7bf178a/2, expected: hdfs://ip-172-18-13-6.ec2.internal:8020
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:652)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:707)
	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:703)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:703)
	at org.graphframes.lib.ConnectedComponents$.org$graphframes$lib$ConnectedComponents$$run(ConnectedComponents.scala:340)
	at org.graphframes.lib.ConnectedComponents.run(ConnectedComponents.scala:139)
	at com.sonatype.aname.sparkjob.SuperClusterV2$.main(SuperClusterV2.scala:87)
	at com.sonatype.aname.sparkjob.SuperClusterV2.main(SuperClusterV2.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)

Is it possible to use S3 as the checkpoint directory?
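
For readers hitting the same trace: the exception is raised by `FileSystem.checkPath` when an HDFS `DistributedFileSystem` instance is asked to delete an `s3://` path. One common way this can happen is obtaining the filesystem with `FileSystem.get(conf)`, which returns the cluster default (`fs.defaultFS`, HDFS on this cluster according to the trace), instead of resolving it from the path. The sketch below shows the path-based lookup. `CheckpointCleanup` and `deleteCheckpoint` are hypothetical names used only for illustration, and the code is not taken from the GraphFrames source or from the commit mentioned in this thread.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CheckpointCleanup {
  /** Recursively deletes `dir` using the filesystem that owns its URI scheme.
    * Hypothetical helper, shown only to illustrate the lookup.
    */
  def deleteCheckpoint(dir: String, conf: Configuration): Boolean = {
    val path = new Path(dir)
    // FileSystem.get(conf) would return the default filesystem (fs.defaultFS).
    // Handing that instance a path with a different scheme fails in checkPath
    // with "Wrong FS". Resolving the filesystem from the path avoids this.
    val fs: FileSystem = path.getFileSystem(conf)
    fs.delete(path, true) // second argument: delete recursively
  }
}
```

With a `SparkContext` named `sc`, this could be called as `CheckpointCleanup.deleteCheckpoint(dir, sc.hadoopConfiguration)`. An `s3://` path additionally needs an S3 filesystem implementation on the classpath.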

felixcheung (Member) commented Feb 24, 2017

It should work with commit 5cfc027.

Would you be able to run against the latest GraphFrames source?

mavencode01 commented Feb 24, 2017

Is that published to the Spark package yet, @felixcheung?

https://spark-packages.org/package/graphframes/graphframes

felixcheung (Member) commented Feb 24, 2017

mavencode01 commented Feb 27, 2017

Thanks @felixcheung, I was able to run against the latest source and it worked.
