Expose the flag to disable Ømq copy buffers #595

dmcguire81 · 2020-09-10T22:59:19Z

One of our engineers found an optimization involving disabling ZeroMQ copy buffers in the ProcessWorker, but this is not exposed in the top-level factory methods, make_reader and make_batch_reader. It's useful, and probably should be.

The text was updated successfully, but these errors were encountered:

selitvin · 2020-09-10T23:01:40Z

Yep. Agreed. Can you please elaborate what is the issue?

dmcguire81 · 2020-09-10T23:11:58Z

There's nothing new to report: better throughput, in exchange for less predictable memory consumption, is a good trade-off for us, and probably for others, so it should be available in the factory methods.

dmcguire81 · 2020-09-10T23:14:10Z

While we're talking, though, it seems like the factory methods only work seamlessly if on whatever storage Uber happens to use (guessing HDFS, given libhdfs is the only config given consistently elevated visibility). The experience on S3 is pretty bad.

dmcguire81 · 2020-09-10T23:17:02Z

We're considering dropping down to the petastorm.reader.Reader interface to deal with having to patch s3fs and pyarrow for the respective shortcomings (or this code base to work around them), but wanted to get your take on a refactoring to the factory methods, themselves, since it seems like a losing battle to offer all the flexibility users will need and still all do the instantiation yourself.

selitvin · 2020-09-10T23:18:13Z

Your diagnosis is correct. We did not use petastorm with s3 internally, hence this is not a polished feature. I can help if you can point me to the issues you run into. I did verify basic functionality but not more then that.

selitvin · 2020-09-10T23:20:12Z

I saw that you created #594. If this is not sufficient (curious what else would we need), perhaps we can have make_reader/make_batch_reader take an instance of already constructed filesystem object?

dmcguire81 · 2020-09-10T23:24:34Z

We built up a bit of a backlog while waiting on legal to approve submitting PRs back, but they're all in there, now, so you see everything that's been needed, so far (except for s3fs==0.5.0 throwing TypeError: 'coroutine' object is not iterable from make_batch_reader, which I just found today, by accident, trying to build you a repro case).

I think passing the pyarrow.Filesystem is a great option, since the rest of the code that's hidden (e.g., instances of petastorm.worker_pool.*) is under your direct control, so won't be as much of a maintenance burden to keep "under the hood".

dmcguire81 · 2020-09-10T23:26:32Z

#590 was actually the real killer, but that addresses every problem we've seen, so far, except this one, so we could get it all done in 2 PRs, probably.

dmcguire81 · 2020-09-10T23:28:56Z

For your part, having it be parameterized instead of discovered, makes it so that you can route defects to those project (pyarrow and s3fs) where they belong. Without making s3fs at least an optional extension, it probably shouldn't be referenced in the code, at all, just advertised as "compatible".

selitvin · 2020-09-10T23:36:32Z

So if I understand correctly if we land all PRs you have put up we would solve your outstanding problems, correct?

dmcguire81 · 2020-09-10T23:42:29Z

Yes, but we have a private fork we're working off of, so are not blocked. I'm really interested in the future-proofing idea of adding a filesystem parameter, but wanted to document the evidence for it with illustrative PRs.

If you want to merge these and then discuss a refactoring, great, but I'm assuming you don't want to add parameters to the factory methods and then remove them.

selitvin · 2020-09-10T23:47:54Z

I would image that s3_config_kwargs would solve 90% of user problems (speculating, since did not work with s3 myself).
If this is the case, then it may be a good idea to land it even that we would have a full-blown filesystem object argument later for other usecases (say filesystems that are currently not supported by petastorm).

dmcguire81 · 2020-09-11T00:05:44Z

Makes sense. I'll update that PR with release notes for you to merge.

dmcguire81 mentioned this issue Sep 10, 2020

Add a flag to factory methods to allow zmq copy buffers to be disabled #596

Merged

dmcguire81 closed this as completed Sep 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose the flag to disable Ømq copy buffers #595

Expose the flag to disable Ømq copy buffers #595

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020 •

edited

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 11, 2020

Expose the flag to disable Ømq copy buffers #595

Expose the flag to disable Ømq copy buffers #595

Comments

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020 • edited

dmcguire81 commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 10, 2020

selitvin commented Sep 10, 2020

dmcguire81 commented Sep 11, 2020

dmcguire81 commented Sep 10, 2020 •

edited