Fix ProxyStore serialization issue with Ray #62
Merged
Description
I couldn't verify that proxies were getting resolved multiple times in TaPS, but I did add some tests to verify that proxies are only resolved when we expect them to be.

I also found that the Ray serialization issues with ProxyStore were limited to `Proxy[bytes]` instances. Ray skips serializing `bytes` instances, and a `Proxy[bytes]` is technically a `bytes` instance, so Ray wasn't serializing the proxy and then crashed because the proxy wasn't actually a bytestring. Since this was limited to `Proxy[bytes]`, which is only used in the synthetic app, I just altered the synthetic app with some data indirection and added a docstring note to the `RayExecutor`.
Fixes
Type of Change
Testing
Updated the unit tests and tested the proxy transformer with Ray.
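For reviewers who want to see the `Proxy[bytes]` behavior from the description in isolation, here is a minimal sketch that needs neither ProxyStore nor Ray. `LazyProxy`, `takes_bytes_fast_path`, and `Payload` are hypothetical stand-ins written for this example; they are not the ProxyStore, Ray, or TaPS implementations. The sketch assumes the proxy reports its target's class through `__class__`, which is how transparent proxies commonly satisfy `isinstance` checks.

```python
from dataclasses import dataclass
from typing import Any, Callable


class LazyProxy:
    """Hypothetical stand-in for a transparent lazy proxy (not ProxyStore's Proxy)."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._target: Any = None
        self._resolved = False

    def resolve(self) -> Any:
        # Call the factory at most once and cache the result.
        if not self._resolved:
            self._target = self._factory()
            self._resolved = True
        return self._target

    @property
    def __class__(self):
        # Report the target's class; this is what makes isinstance() succeed.
        return type(self.resolve())


resolve_calls = []


def load_payload() -> bytes:
    # Record every resolution so the count can be asserted on.
    resolve_calls.append('resolved')
    return b'payload'


def takes_bytes_fast_path(obj: Any) -> bool:
    # Hypothetical serializer check: bytes are passed through unserialized.
    return isinstance(obj, bytes)


proxy = LazyProxy(load_payload)
calls_after_create = len(resolve_calls)             # creating a proxy does not resolve it
proxy_on_fast_path = takes_bytes_fast_path(proxy)   # the proxy looks like bytes
proxy_is_real_bytes = type(proxy) is bytes          # but it is not a real bytestring
takes_bytes_fast_path(proxy)                        # a second check reuses the cached target
calls_after_checks = len(resolve_calls)


@dataclass
class Payload:
    """Hypothetical wrapper providing the data indirection."""

    raw: bytes


# With indirection, the proxy targets a Payload and no longer matches bytes.
wrapped = LazyProxy(lambda: Payload(b'payload'))
wrapped_on_fast_path = takes_bytes_fast_path(wrapped)
```

In this sketch, the proxy of raw bytes passes the `isinstance` check and would take the unserialized path, while the proxy of the wrapper object would not. The call counter mirrors the kind of "resolved only when expected" assertion the new tests make.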
I also tested ProxyStore performance with Dask and got the following.
Baseline
ProxyStore
As expected, using ProxyStore is much faster (more than 2x here), which is reasonably in line with what we found in the paper.
Pull Request Checklist
Please confirm the PR meets the following requirements.
`pre-commit` (e.g., ruff, mypy, etc.).