ARROW-3777: [C++] Add Slow input streams and slow filesystem#5439
ARROW-3777: [C++] Add Slow input streams and slow filesystem#5439pitrou wants to merge 2 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
This is unrelated to this PR, but I finally got annoyed enough by the slow format-check that I spent some time figuring out what happened. The bottom line is that each call to _check_one_file in a child process was serializing and transmitting the entire results for all files. Instead, we just transmit the result for the single file being checked, it's much faster.
bcfd933 to
b46487d
Compare
b46487d to
2ca64c5
Compare
Codecov Report
@@ Coverage Diff @@
## master #5439 +/- ##
==========================================
- Coverage 89.69% 89.69% -0.01%
==========================================
Files 810 710 -100
Lines 114853 108385 -6468
Branches 1498 0 -1498
==========================================
- Hits 103021 97219 -5802
+ Misses 11821 11166 -655
+ Partials 11 0 -11
Continue to review full report at Codecov.
|
wesm
left a comment
There was a problem hiding this comment.
Thanks for doing this. Minor comments but can be merged whenever. I think we should opt for reducing the size of header files (and hiding more implementation details) when possible
|
|
||
| /// \brief EXPERIMENTAL: a FileSystem implementation that delegates to another | ||
| /// implementation but inserts latencies at various points. | ||
| class ARROW_EXPORT SlowFileSystem : public FileSystem { |
There was a problem hiding this comment.
I've brought this topic up before, but what do you think about putting the entire implementation of SlowFS in a .cc file and returning std::shared_ptr from the function that creates it? This can always be done later so refactor need not happen now. It's probably better to use factory methods for instantiating most FS classes anyway
Unless you anticipate adding additional methods to the class
There was a problem hiding this comment.
Same comments hold regarding the slow stream classes
There was a problem hiding this comment.
That's a good question. I don't think our conventions should vary too much. If some classes are exposed and other hidden it feels a bit weird. But strong opinion from me.
There was a problem hiding this comment.
My preference would be to hide the implementations in all cases except where we expose additional methods. If only that it offers more freedom to refactor
There was a problem hiding this comment.
Ok, we discuss this in a separate JIRA. Will merge.
No description provided.