FEAT-#7139: Use ray-core instead of ray-default #6955

Merged: 25 commits, May 2, 2024
Changes from 24 commits
Commits
7465173
TEST-#0000: try to use ray-core instead ray-default
anmyachev Feb 21, 2024
2089b57
try to use ray[client]
anmyachev Feb 22, 2024
243a7dd
explicitly specify grpcio
anmyachev Mar 27, 2024
84a9e1b
check exact ray version: 1.13.0
anmyachev Mar 27, 2024
bd37115
check exact ray version: 2.1.0
anmyachev Mar 27, 2024
61e4174
check exact ray version: 2.2.0
anmyachev Mar 27, 2024
fb14208
use 'pydantic<2' pin
anmyachev Mar 27, 2024
9b60439
check exact ray version: 2.4.0
anmyachev Mar 27, 2024
ba12d22
check exact ray version: 2.6.1
anmyachev Mar 27, 2024
c9e3094
check exact ray version: 2.7.1
anmyachev Mar 27, 2024
bfdc8a4
check exact ray version: 2.8.1
anmyachev Mar 28, 2024
255bdca
check exact ray version: 2.9.1
anmyachev Mar 28, 2024
9cef2f9
check exact ray version: 2.10.0
anmyachev Mar 28, 2024
4887443
Merge branch 'master' of https://github.com/modin-project/modin into …
anmyachev Apr 2, 2024
0abc907
don't install grpcio
anmyachev Apr 30, 2024
c0b34df
fix
anmyachev Apr 30, 2024
a2c48c3
REVERT ME
anmyachev Apr 30, 2024
34074f9
Merge branch 'main' of https://github.com/modin-project/modin into tr…
anmyachev Apr 30, 2024
bb92695
Revert "REVERT ME"
anmyachev Apr 30, 2024
8342321
completly remove 'ClientObjectRef' type
anmyachev Apr 30, 2024
71452a6
add a note about ray's dependencies
anmyachev May 1, 2024
92b9521
fix
anmyachev May 1, 2024
7d1970b
Apply suggestions from code review
anmyachev May 2, 2024
d99f198
add more notes
anmyachev May 2, 2024
da1c0d2
Apply suggestions from code review
YarShev May 2, 2024
8 changes: 8 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -76,6 +76,10 @@ Otherwise, installation of `modin[mpi]` may fail. Refer to
[Installing with pip](https://unidist.readthedocs.io/en/latest/installation.html#installing-with-pip)
section of the unidist documentation for more details about installation.

**Note:** Since Modin 0.30.0 we use a reduced set of Ray dependencies: `ray` instead of `ray[default]`.
This means that the dashboard and cluster launcher are no longer installed by default.
If you need those, consider installing `ray[default]` along with `modin[ray]`.

Modin automatically detects which engine(s) you have installed and uses that for scheduling computation.

#### From conda-forge
@@ -97,6 +101,10 @@ conda install -c conda-forge modin-mpi # Install Modin dependencies and MPI thro
conda install -c conda-forge modin-hdk # Install Modin dependencies and HDK.
```

**Note:** Since Modin 0.30.0 we use a reduced set of Ray dependencies: `ray-core` instead of `ray-default`.
This means that the dashboard and cluster launcher are no longer installed by default.
If you need those, consider installing `ray-default` along with `modin-ray`.

Refer to
[Installing with conda](https://unidist.readthedocs.io/en/latest/installation.html#installing-with-conda)
section of the unidist documentation for more details on how to install a specific MPI implementation to run on.
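As a rough way to tell whether the optional Ray components mentioned in the notes above are usable in a given environment, one can probe for some of the packages they depend on. This is a minimal sketch and not part of the PR: the helper `missing_modules` is hypothetical, and the module names in `DASHBOARD_HINTS` are an assumption about what `ray[default]` pulls in, so Ray's own packaging metadata remains the authoritative list.

```python
from importlib.util import find_spec

# Assumption: packages that ray[default] typically installs for the dashboard.
# This list is illustrative only and may differ between Ray versions.
DASHBOARD_HINTS = ["aiohttp", "aiohttp_cors", "opencensus", "prometheus_client"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(DASHBOARD_HINTS)
    if missing:
        print("Possibly missing dashboard dependencies:", ", ".join(missing))
    else:
        print("All probed dashboard dependencies are importable.")
```

If anything is reported missing and the dashboard or cluster launcher is needed, installing `ray[default]` (pip) or `ray-default` (conda) alongside Modin, as the notes suggest, should bring it in.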
8 changes: 8 additions & 0 deletions docs/getting_started/installation.rst
@@ -40,6 +40,10 @@ it is required to have a working MPI implementation installed beforehand.
Otherwise, installation of ``modin[mpi]`` may fail. Refer to
`Installing with pip`_ section of the unidist documentation for more details about installation.

**Note:** Since Modin 0.30.0 we use a reduced set of Ray dependencies: `ray` instead of `ray[default]`.
This means that the dashboard and cluster launcher are no longer installed by default.
If you need those, consider installing `ray[default]` along with `modin[ray]`.

Modin will automatically detect which engine you have installed and use that for
scheduling computation! See below for HDK engine installation.

@@ -129,6 +133,10 @@ it is possible to install modin with chosen engine(s) alongside. Current options
| modin-all | Dask, Ray, Unidist, HDK | Linux |
+---------------------------------+---------------------------+-----------------------------+

**Note:** Since Modin 0.30.0 we use a reduced set of Ray dependencies: `ray-core` instead of `ray-default`.
This means that the dashboard and cluster launcher are no longer installed by default.
If you need those, consider installing `ray-default` along with `modin-ray`.

For installing Dask, Ray and MPI through unidist engines into conda environment following command should be used:

.. code-block:: bash
2 changes: 1 addition & 1 deletion docs/requirements-doc.txt
@@ -13,7 +13,7 @@ recommonmark
sphinx<6.0.0
sphinx-click
# ray==2.5.0 broken: https://github.com/conda-forge/ray-packages-feedstock/issues/100
ray[default]>=2.1.0,!=2.5.0
ray>=2.1.0,!=2.5.0
# Override to latest version of modin-spreadsheet
git+https://github.com/modin-project/modin-spreadsheet.git@49ffd89f683f54c311867d602c55443fb11bf2a5
sphinxcontrib_plantuml
2 changes: 1 addition & 1 deletion environment-dev.yml
@@ -13,7 +13,7 @@ dependencies:

# optional dependencies
# ray==2.5.0 broken: https://github.com/conda-forge/ray-packages-feedstock/issues/100
- ray-default>=2.1.0,!=2.5.0
- ray-core>=2.1.0,!=2.5.0
- pyarrow>=7.0.0
# workaround for https://github.com/conda/conda/issues/11744
- grpcio!=1.45.*
5 changes: 2 additions & 3 deletions modin/core/execution/ray/common/deferred_execution.py
@@ -30,13 +30,12 @@
import pandas
import ray
from ray._private.services import get_node_ip_address
from ray.util.client.common import ClientObjectRef

from modin.config import RayTaskCustomResources
from modin.core.execution.ray.common import MaterializationHook, RayWrapper
from modin.logging import get_logger

ObjectRefType = Union[ray.ObjectRef, ClientObjectRef, None]
ObjectRefType = Union[ray.ObjectRef, None]
ObjectRefOrListType = Union[ObjectRefType, List[ObjectRefType]]
ListOrTuple = (list, tuple)

@@ -478,7 +477,7 @@ class MetaList:
obj : ray.ObjectID or list
"""

def __init__(self, obj: Union[ray.ObjectID, ClientObjectRef, List]):
def __init__(self, obj: Union[ray.ObjectID, List]):
self._obj = obj

def __getitem__(self, index):
14 changes: 6 additions & 8 deletions modin/core/execution/ray/common/engine_wrapper.py
@@ -24,7 +24,6 @@

import pandas
import ray
from ray.util.client.common import ClientObjectRef

from modin.config import RayTaskCustomResources
from modin.error_message import ErrorMessage
@@ -129,14 +128,14 @@ def materialize(cls, obj_id):
obj = obj_id.pre_materialize()
return (
obj_id.post_materialize(ray.get(obj))
if isinstance(obj, RayObjectRefTypes)
if isinstance(obj, ray.ObjectRef)
else obj
)

if not isinstance(obj_id, Sequence):
return ray.get(obj_id) if isinstance(obj_id, RayObjectRefTypes) else obj_id
return ray.get(obj_id) if isinstance(obj_id, ray.ObjectRef) else obj_id

if all(isinstance(obj, RayObjectRefTypes) for obj in obj_id):
if all(isinstance(obj, ray.ObjectRef) for obj in obj_id):
return ray.get(obj_id)

ids = {}
@@ -147,7 +146,7 @@ def materialize(cls, obj_id):
continue
if isinstance(obj, MaterializationHook):
oid = obj.pre_materialize()
if isinstance(oid, RayObjectRefTypes):
if isinstance(oid, ray.ObjectRef):
hook = obj
obj = oid
else:
@@ -231,7 +230,7 @@ def wait(cls, obj_ids, num_returns=None):
for obj in obj_ids:
if isinstance(obj, MaterializationHook):
obj = obj.pre_materialize()
if isinstance(obj, RayObjectRefTypes):
if isinstance(obj, ray.ObjectRef):
ids.add(obj)

if num_ids := len(ids):
@@ -332,5 +331,4 @@ def __reduce__(self):
return int, (data,)


RayObjectRefTypes = (ray.ObjectRef, ClientObjectRef)
ObjectRefTypes = (*RayObjectRefTypes, MaterializationHook)
ObjectRefTypes = (ray.ObjectRef, MaterializationHook)
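The effect of the `materialize` hunks above, where the `RayObjectRefTypes` tuple gives way to a single `ray.ObjectRef` check, can be sketched without Ray installed. In this minimal sketch `FakeObjectRef` and `fake_get` are hypothetical stand-ins for `ray.ObjectRef` and `ray.get`; they are not part of Modin or Ray, and the `MaterializationHook` handling of the real method is deliberately left out.

```python
from collections.abc import Sequence


class FakeObjectRef:
    """Hypothetical stand-in for ray.ObjectRef; holds the value it refers to."""

    def __init__(self, value):
        self.value = value


def fake_get(refs):
    """Hypothetical stand-in for ray.get: resolve one ref or a list of refs."""
    if isinstance(refs, FakeObjectRef):
        return refs.value
    return [ref.value for ref in refs]


def materialize(obj_id):
    """Resolve references, passing plain values through unchanged."""
    # A single concrete type is checked instead of a tuple of reference types.
    if not isinstance(obj_id, Sequence):
        return fake_get(obj_id) if isinstance(obj_id, FakeObjectRef) else obj_id
    # Fast path: every element is a reference, so resolve them in one call.
    if all(isinstance(obj, FakeObjectRef) for obj in obj_id):
        return fake_get(obj_id)
    # Mixed sequence: resolve only the references and keep other values as-is.
    return [
        fake_get(obj) if isinstance(obj, FakeObjectRef) else obj for obj in obj_id
    ]
```

With only one reference type left, each `isinstance` call names that type directly, which is why the module-level `RayObjectRefTypes` tuple could be dropped in the diff.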
Original file line number Diff line number Diff line change
@@ -13,14 +13,11 @@

"""Module houses class that wraps data (block partition) and its metadata."""

from typing import TYPE_CHECKING, Callable, Union
from typing import Callable, Union

import pandas
import ray

if TYPE_CHECKING:
from ray.util.client.common import ClientObjectRef

from modin.config import LazyExecution, RayTaskCustomResources
from modin.core.dataframe.pandas.partitioning.partition import PandasDataframePartition
from modin.core.execution.ray.common import MaterializationHook, RayWrapper
@@ -60,7 +57,7 @@ class PandasOnRayDataframePartition(PandasDataframePartition):

def __init__(
self,
data: Union[ray.ObjectRef, "ClientObjectRef", DeferredExecution],
data: Union[ray.ObjectRef, DeferredExecution],
length: int = None,
width: int = None,
ip: str = None,
@@ -328,7 +325,7 @@ def ip(self, materialize=True):
return ip

@property
def _data(self) -> Union[ray.ObjectRef, "ClientObjectRef"]: # noqa: GL08
def _data(self) -> ray.ObjectRef: # noqa: GL08
self.drain_call_queue()
return self._data_ref

2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -7,7 +7,7 @@ psutil>=5.8.0

## optional dependencies
# ray==2.5.0 broken: https://github.com/conda-forge/ray-packages-feedstock/issues/100
ray[default]>=2.1.0,!=2.5.0
ray>=2.1.0,!=2.5.0
pyarrow>=7.0.0
dask[complete]>=2.22.0
distributed>=2.22.0
2 changes: 1 addition & 1 deletion setup.py
@@ -7,7 +7,7 @@

dask_deps = ["dask>=2.22.0", "distributed>=2.22.0"]
# ray==2.5.0 broken: https://github.com/conda-forge/ray-packages-feedstock/issues/100
ray_deps = ["ray[default]>=2.1.0,!=2.5.0", "pyarrow>=7.0.0"]
ray_deps = ["ray>=2.1.0,!=2.5.0", "pyarrow>=7.0.0"]
mpi_deps = ["unidist[mpi]>=0.2.1"]
consortium_standard_deps = ["dataframe-api-compat>=0.2.7"]
spreadsheet_deps = ["modin-spreadsheet>=0.1.0"]