Wrong FS on loading json with spark from s3 #114
Comments
Thanks! Would you like to contribute the fix?
@harsha2010 I don't really know how to fix it; the part about the split path is from someone on the AWS forum.
harsha2010 pushed a commit that referenced this issue on Jun 10, 2017
harsha2010 added a commit that referenced this issue on Jun 13, 2017
Fixes issue #114 by correctly initializing the FileSystem from the …
When trying to load a JSON file from S3 with magellan on an AWS EMR cluster:
val polygons = spark.read.format("magellan").option("type", "geojson").load(inJson)
you get:
WholeFileReader gets the default FileSystem instead of using the split path to determine the appropriate one. As a result, it does not get the FileSystem for S3 (EmrFS by default) but the one for HDFS, which is the default FileSystem on EMR clusters.
magellan/src/main/scala/magellan/mapreduce/WholeFileReader.scala
Line 42 in 3d282cd
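For illustration, here is a minimal sketch of the difference, not the actual magellan code or the committed fix. The object `SplitFileSystem` and its methods are hypothetical names; only `FileSystem.get(Configuration)` and `Path.getFileSystem(Configuration)` are real Hadoop API calls. It assumes the Hadoop client libraries are on the classpath.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataInputStream, FileSystem, Path}

// Hypothetical helper, for illustration only.
object SplitFileSystem {

  // Problematic pattern: ignores the path and returns whatever fs.defaultFS
  // names (HDFS on an EMR cluster), even when the path is s3://...
  def defaultFileSystem(conf: Configuration): FileSystem =
    FileSystem.get(conf)

  // Suggested pattern: let the path's own scheme (s3://, hdfs://, file://)
  // select the FileSystem implementation.
  def fileSystemFor(path: Path, conf: Configuration): FileSystem =
    path.getFileSystem(conf)

  // Opens the file with the FileSystem that matches its scheme.
  def open(path: Path, conf: Configuration): FSDataInputStream =
    fileSystemFor(path, conf).open(path)
}
```

In a record reader, the path would presumably come from the split (for example `FileSplit.getPath`), so the same reader works whether the input lives on S3 or HDFS.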