It's kind of a pain to have to single out empty files (basically, adding the line read_f = read_f[file.info(read_f)$size > 0]) when the vast majority of the time this operation works as intended (since it's rare for Spark to output empty files) -- is there any reason fread can't just warn about such a file and skip it?
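For context, a minimal sketch of the workaround described above -- `fread_skip_empty` is a hypothetical helper name, not part of data.table:

```r
library(data.table)

# Hypothetical helper: read a set of part files, warning on and skipping
# any zero-byte files instead of letting fread error out on them.
fread_skip_empty <- function(files) {
  empty <- file.info(files)$size == 0
  if (any(empty)) {
    warning(sprintf("skipping %d empty file(s): %s",
                    sum(empty), paste(files[empty], collapse = ", ")))
  }
  # read the non-empty files and stack them
  rbindlist(lapply(files[!empty], fread))
}
```

The ask in this issue is essentially for fread itself to behave this way (warn and return an empty table) rather than requiring every caller to pre-filter.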
Not sure I follow -- are you suggesting I file an issue with them? This didn't come from any particular issue.
As I understand it, the root cause here is that I'm reading from a parquet directory that has a bunch of empty constituent files. So when I coalesce, there's still a partition with no actual data in it. Not sure how to overcome this, so I'm stuck with empty files for now. (That is to say, I don't think this is a bug on Spark's part, per se.)