Add S3 based Parquet directory loader #463

Closed
fnothaft opened this Issue Nov 5, 2014 · 4 comments

3 participants
@fnothaft
Member

fnothaft commented Nov 5, 2014

Tagging @tdanford @carlyeks

The current AvroParquetRDD loads just a single partition of a Parquet file from S3. If you have an indexed RDD, you can load a whole directory. We should add code to AvroParquetRDD so you can load a whole directory at once.
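As a minimal sketch of the directory-loading idea (not ADAM's actual API, and not the code that was eventually written): a Parquet "directory" written by Hadoop or Spark is typically a set of part files plus marker files such as `_SUCCESS` and `_metadata`, so a whole-directory loader first has to pick out the part files under a key prefix. The function name `select_parquet_parts`, the `.parquet` suffix check, and the example keys are all assumptions made for illustration; a real loader would get the key list from an S3 listing call.

```python
def select_parquet_parts(keys, directory_prefix):
    """Return the Parquet part-file keys stored directly under directory_prefix.

    `keys` is any iterable of object keys (for example, the result of an S3
    listing). Marker files and nested "subdirectories" are skipped.
    """
    prefix = directory_prefix.rstrip("/") + "/"
    parts = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        # Skip the directory placeholder itself and anything nested deeper.
        if not relative or "/" in relative:
            continue
        # Skip marker/hidden files such as _SUCCESS and _metadata.
        if relative.startswith(("_", ".")):
            continue
        if relative.endswith(".parquet"):
            parts.append(key)
    # Sort so the part files are loaded in a stable order.
    return sorted(parts)


# Hypothetical keys, for illustration only.
example_keys = [
    "reads.adam/_SUCCESS",
    "reads.adam/_metadata",
    "reads.adam/part-r-00001.gz.parquet",
    "reads.adam/part-r-00000.gz.parquet",
    "other.adam/part-r-00000.gz.parquet",
]
selected = select_parquet_parts(example_keys, "reads.adam")
```

Each selected key could then be handed to the existing single-file loading path, with the resulting partitions combined into one RDD.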

Contributor

tdanford commented Nov 5, 2014

@carlyeks want to work on this today, when we meet up at 2pm?

Member

heuermh commented Jul 10, 2015

Can this issue be closed? I lost the thread in the jump to the utils project; was the relevant code in bigdatagenomics/utils#10?

Member

fnothaft commented Jul 10, 2015

Yes, I think so. @tdanford are you OK with deleting the topic branch for this work?

@fnothaft fnothaft added the wontfix label Jul 6, 2016

Member

fnothaft commented Jul 6, 2016

Closing as won't fix.

@fnothaft fnothaft closed this Jul 6, 2016
