Issues loading BAM files in Google FS #1816
Comments
Similar issue to #1732 in S3.
Hi @Georgehe4! You need the Java NIO FileSystemProvider for the
@Georgehe4 if you get this working on your side, would you mind writing it up for https://github.com/bigdatagenomics/adam/blob/65dde41b50fee29bce8f8941ca6323fba840f3eb/docs/source/40_deploying_ADAM.md#input-and-output-data-on-hdfs-and-s3 in the docs?
Haven't looked closely at this, but step 1 is to include the NIO provider, as Frank said above. Here's a piece of documentation about that; you most likely want the "shaded" JAR, to avoid dependency version conflicts with Spark or other things on your classpath. (The latest version seems to be 0.30.0-alpha.) At that point, you may find that the NIO provider is still not being found; Scala does something to its classloaders that breaks custom NIO provider detection (cf. scala/bug#10247). I've dealt with that by using my own … Hope that helps!
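To tell the two failure modes above apart (provider JAR missing vs. provider present but not detected), a small check like the following may help. This is only a sketch: the class name `NioProviderCheck` and its helper methods are made up for illustration, and the `gs` scheme is assumed to be the one the Google Cloud Storage NIO provider registers. It uses only standard JDK calls. `FileSystemProvider.installedProviders()` only sees providers visible to the system class loader, while `ServiceLoader.load` can be pointed at another loader, such as the thread context class loader.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.Optional;
import java.util.ServiceLoader;

// Hypothetical diagnostic helper; not part of ADAM or google-cloud-nio.
public class NioProviderCheck {

    // Looks for a provider among those the JVM installed at startup
    // (these are discovered through the system class loader only).
    static Optional<FileSystemProvider> findInstalled(String scheme) {
        for (FileSystemProvider p : FileSystemProvider.installedProviders()) {
            if (p.getScheme().equalsIgnoreCase(scheme)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    // Looks for a provider through an explicit class loader, which can see
    // JARs that were added after JVM startup (for example by Spark).
    static Optional<FileSystemProvider> findWithLoader(String scheme, ClassLoader loader) {
        for (FileSystemProvider p : ServiceLoader.load(FileSystemProvider.class, loader)) {
            if (p.getScheme().equalsIgnoreCase(scheme)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "gs" is an assumption about the scheme of the GCS provider.
        String scheme = args.length > 0 ? args[0] : "gs";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("installed provider found: " + findInstalled(scheme).isPresent());
        System.out.println("found via context class loader: " + findWithLoader(scheme, loader).isPresent());
    }
}
```

If the first lookup fails but the second succeeds, the JAR is on the classpath and the problem is likely the class loader issue described above rather than a missing dependency.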
The shaded JAR seems to get the program closer to integrating with gs, but there are some auth issues that need to be resolved first.
Using the 0.22.0-alpha version of the shaded JAR seems to work. Similar issues when pulling from gs on GCE VMs are being tracked here: googleapis/google-cloud-java#2453
Hi @Georgehe4! I know you've been working on this for bigdatagenomics/mango#340. When you're done downstream, would you mind pushing some of that info back upstream?
Yep, for sure.
Got some time unblocked this week, I'll be working on a PR.
Resolved by #1918.
There are issues in ADAM when trying to load files from the Google file system:
For comparison, loading .vcf files from the Google file system does not have issues.