Ray data migration #1260
Conversation
ludwig/backend/ray.py
Outdated
```diff
@@ -176,16 +177,20 @@ def batch_predict(self, model, dataset, *args, **kwargs):
         predictor_kwargs = self.predictor_kwargs
         output_columns = get_output_columns(model.output_features)

-        def batch_predict_partition(dataset):
+        def batch_predict_partition(df):
```
This will be more efficient if we turn it into a (stateful) actor:

```python
class BatchInferModel:
    def __init__(self):
        self.model = remote_model.load()
        self.predictor = Predictor(**predictor_kwargs)

    def __call__(self, df: pd.DataFrame) -> pd.DataFrame:
        pd_ds = PandasDataset(df, dataset.features, dataset.data_hdf5_fp)
        predictions = self.predictor.batch_predict(self.model, pd_ds, *args, **kwargs)
        ordered_predictions = predictions[output_columns]
        return ordered_predictions

return dataset.ds.map_batches(BatchInferModel, ...)
```

This allows us to cache the model and predictor in memory between calls.
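To make the caching benefit concrete, here is a minimal, Ray-free sketch of the stateful-actor pattern. `FakeModel` and `run_batches` are hypothetical stand-ins: Ray's `map_batches` constructs one instance of the callable class per worker and reuses it for every batch, so the expensive `load()` runs once instead of once per batch.

```python
# Hypothetical stand-ins to illustrate the stateful callable-class pattern.
LOAD_CALLS = 0

class FakeModel:
    def load(self):
        # Pretend this is an expensive model load.
        global LOAD_CALLS
        LOAD_CALLS += 1
        return self

    def predict(self, batch):
        return [x * 2 for x in batch]

class BatchInferModel:
    def __init__(self):
        # Expensive setup runs once per worker, not once per batch.
        self.model = FakeModel().load()

    def __call__(self, batch):
        return self.model.predict(batch)

def run_batches(fn_cls, batches):
    # Mimics the map_batches contract: instantiate the callable class once,
    # then invoke it for every batch.
    fn = fn_cls()
    return [fn(b) for b in batches]

results = run_batches(BatchInferModel, [[1, 2], [3]])
```

With a plain function instead of a class, the load would repeat for every batch; the class moves it into `__init__`.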
Thanks for the suggestion, done!
ludwig/backend/ray.py
Outdated

```python
            meta=[(c, 'object') for c in output_columns]

        num_gpus = int(ray.cluster_resources().get('GPU', 0) > 0)
        return dataset.ds.map_batches(
```
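The `num_gpus` line above requests one GPU per inference task only when the cluster reports any GPU capacity. A small sketch of that calculation, with `gpus_per_task` as a hypothetical helper name and the dict standing in for the output of `ray.cluster_resources()`:

```python
def gpus_per_task(cluster_resources: dict) -> int:
    # Request exactly one GPU per task if the cluster has any GPUs at all,
    # otherwise zero; mirrors int(... .get('GPU', 0) > 0) in the diff.
    return int(cluster_resources.get('GPU', 0) > 0)

cpu_only = gpus_per_task({'CPU': 8.0})
with_gpus = gpus_per_task({'CPU': 8.0, 'GPU': 2.0})
```

Note this caps the request at one GPU per task regardless of how many GPUs the cluster has; parallelism across GPUs comes from running multiple tasks.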
We may need to do a conversion to Dask DataFrame here:

```python
dataset.ds.map_batches(...).to_dask()
```

The other option is to carry through the RayDataset and then only call the methods to save to a particular format at the end. This might be a better long-term solution, but it might be too involved for this PR, since the code already assumes a Dask DataFrame will be used downstream. What do you think?
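The design trade-off here can be sketched without Ray or Dask: either convert eagerly at each call site, or keep the native dataset type through the pipeline and convert only at the boundary (analogous to calling `.to_dask()` once at the end). `NativeDataset` and `to_downstream` are hypothetical names for illustration only.

```python
class NativeDataset:
    """Stand-in for a Ray Dataset carried through the pipeline."""

    def __init__(self, rows):
        self.rows = rows

    def map_batches(self, fn):
        # Transformations stay in the native representation.
        return NativeDataset(fn(self.rows))

    def to_downstream(self):
        # Single boundary conversion, analogous to Dataset.to_dask().
        return list(self.rows)

ds = NativeDataset([1, 2, 3]).map_batches(lambda rows: [r + 1 for r in rows])
out = ds.to_downstream()  # convert only at the edge of the pipeline
```

Carrying the native type avoids repeated conversions mid-pipeline, at the cost of touching every downstream consumer, which is why deferring it to a follow-up PR is reasonable.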
Yeah, I agree. I really like the idea of carrying Ray Datasets throughout prediction, and I can do that in a new PR.
ludwig/backend/ray.py
Outdated
```diff
@@ -163,6 +166,22 @@ def shutdown(self):
         self.executor.shutdown()


+class BatchInferModel:
+    def __init__(self, remote_model, predictor_kwargs, output_columns, features, data_hdf5_fp, *args, **kwargs):
+        self.model = remote_model.load()
```
We need to define this class in-line because we don't want to load the remote model until we're running in the worker context.
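The deferred-loading constraint can be shown with a small closure-based sketch. `DummyRemoteModel` and `make_batch_infer_model` are hypothetical names: the point is that the remote model handle is only captured at definition time, and `load()` does not run until a "worker" actually constructs the class.

```python
def make_batch_infer_model(remote_model):
    # The class closes over remote_model; nothing is loaded yet.
    class BatchInferModel:
        def __init__(self):
            # Deferred: runs only when the worker instantiates the class.
            self.model = remote_model.load()

        def __call__(self, batch):
            return [self.model.scale * x for x in batch]

    return BatchInferModel

class DummyRemoteModel:
    def __init__(self, scale):
        self.scale = scale
        self.loaded = False

    def load(self):
        self.loaded = True
        return self

rm = DummyRemoteModel(10)
cls = make_batch_infer_model(rm)
assert not rm.loaded      # defining the class did not load the model
infer = cls()             # the "worker" constructs the callable
out = infer([1, 2])
```

If the model were loaded at module or driver scope instead, it would be serialized and shipped to every worker, which is exactly what deferring the `load()` avoids.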
ludwig/data/dataset/tfrecord.py
Outdated
```diff
@@ -190,6 +190,7 @@ def create(self, dataset, config, training_set_metadata):
         )

+    def create_inference_dataset(self, dataset, tag, config, training_set_metadata):
+        # TODO(shreya): Confirm if this needs to be updated to RayDatasets too.
```
Probably we want to remove PartitionedDataset here and just raise an exception like:

```python
raise ValueError('Batch inference not supported with TFRecord format at this time')
```
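A minimal sketch of what that guard could look like as a helper, assuming a hypothetical `data_format` string argument and helper name (neither is in the PR):

```python
def check_batch_inference_supported(data_format: str) -> None:
    # Fail fast with a clear message instead of silently misbehaving
    # on an unsupported dataset format.
    if data_format == 'tfrecord':
        raise ValueError(
            'Batch inference not supported with TFRecord format at this time'
        )

check_batch_inference_supported('parquet')  # supported formats pass silently
```

Raising early keeps the unsupported path explicit until RayDatasets support lands for TFRecord.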
ludwig/api.py
Outdated
```diff
@@ -872,6 +872,7 @@ def evaluate(
         # calculate the overall metrics
         if collect_overall_stats:
             # TODO ray: support calculating stats on partitioned datasets
+            # TODO(shreya): Confirm what's needed to enable this.
```
Change `isinstance` check to `RayDataset`
Nice!
Nice work figuring out the flatten stuff. Just a couple small things, then we should be good to land.
The last thing we need to double-check is the optional dependency imports. I added a few comments about how we can work around that. Let me know if it makes sense.
LGTM! Awesome work.
Code Pull Requests
This PR adds support for Ray Datasets for distributed prediction.
Documentation Pull Requests
Note that the documentation HTML files are in `docs/` while the Markdown sources are in `mkdocs/docs`. If you are proposing a modification to the documentation you should change only the Markdown files.
`api.md` is automatically generated from the docstrings in the code, so if you want to change something in that file, first modify the `ludwig/api.py` docstring, then run `mkdocs/code_docs_autogen.py`, which will create `mkdocs/docs/api.md`.