[Core] Non Unit Instance fractional value fix #39293
Force-pushed from a7c6b77 to 9fb8390.
LGTM!
@@ -194,3 +194,6 @@ The precision of the fractional resource requirement is 0.0001 so you should avo
 
 Besides resource requirements, you can also specify an environment for a task or actor to run in,
 which can include Python packages, local files, environment variables, and more---see :ref:`Runtime Environments <runtime-environments>` for details.
+
+.. note::
+  Fractional input must be between 0 to 1. For any value greater than 1, `num_cpus` and `num_gpus` requires a whole number.
- Fractional input must be between 0 to 1. For any value greater than 1, `num_cpus` and `num_gpus` requires a whole number.
+ Fractional resource requirement must be smaller than 1. For any resource requirement greater than 1, it needs to be a whole number (e.g. ``num_cpus=1.5`` is invalid).
Can you move it above the tip?
Force-pushed from 9fb8390 to d3c2d9c.
Signed-off-by: Jonathan Nitisastro <jonathannitisastro@jonathancn-QC69NQYVVG.local.meter>
Force-pushed from 7232c66 to b73674b.
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Stamp for docs
@@ -194,3 +194,6 @@ The precision of the fractional resource requirement is 0.0001 so you should avo
 
 Besides resource requirements, you can also specify an environment for a task or actor to run in,
 which can include Python packages, local files, environment variables, and more---see :ref:`Runtime Environments <runtime-environments>` for details.
+
+.. note::
+  For any unit resource (GPU, TPU, Neuron Core) requirement greater than 1, it needs to be a whole number (e.g. ``num_gpus=1.5`` is invalid).
Can you add more detail on the definition of a unit resource?
It is internal terminology.
python/ray/tests/test_advanced_2.py
Outdated
@ray.remote(num_cpus=1.5)
def test():
    test_frac_cpu.remote()
- test_frac_cpu.remote()
+ assert ray.get(test_frac_cpu.remote())
Also make test_frac_cpu return True (an explicit assert is better).
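The reasoning behind this suggestion can be illustrated without a Ray cluster. The sketch below is a stand-in that uses the standard-library `concurrent.futures` rather than Ray: `pool.submit(...)` plays the role of `.remote()` and `.result()` plays the role of `ray.get(...)`. The names `frac_task` and `broken_task` are hypothetical and do not appear in the PR.

```python
from concurrent.futures import ThreadPoolExecutor


def frac_task():
    # Stand-in for test_frac_cpu: returns True so the caller can assert on it.
    return True


def broken_task():
    # Stand-in for a task that fails after it has been submitted.
    raise RuntimeError("task failed")


with ThreadPoolExecutor(max_workers=1) as pool:
    # Fire-and-forget: the failure is stored in the future and never raised here.
    ignored = pool.submit(broken_task)

    # Waiting on the result (the analogue of ray.get) surfaces the failure.
    try:
        pool.submit(broken_task).result()
        failure_surfaced = False
    except RuntimeError:
        failure_surfaced = True

    # An explicit assert on the returned value also proves the task really ran.
    task_returned_true = pool.submit(frac_task).result()
    assert task_returned_true
```

In a test, only the asserted form fails when the task itself fails, which is why the explicit assert is preferred.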
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Force-pushed from 18927a6 to f90109e.
lint failure
I hope the rewrite still makes sense. Let me know if you have questions.
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: jonathan-anyscale <144177685+jonathan-anyscale@users.noreply.github.com>
merge conflict
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
- which can include Python packages, local files, environment variables, and more---see :ref:`Runtime Environments <runtime-environments>` for details.
+ which can include Python packages, local files, environment variables, and more. See :ref:`Runtime Environments <runtime-environments>` for details.

.. note::
Can you move it above the tip?
which can include Python packages, local files, environment variables, and more. See :ref:`Runtime Environments <runtime-environments>` for details.

.. note::
  Unit resource requirements that are greater than 1,
People usually don't know what unit resources are. We should just list them out, like:
GPU, TPU, neuron_cores resource requirements that are greater than 1 need to be whole numbers. For example, ``num_gpus=1.5`` is invalid.
The GPU, TPU, and Neuron Core resources are listed on the next line.
python/ray/_raylet.pyx
Outdated
unit_resources = f"{RayConfig.instance().predefined_unit_instance_resources()},\
{RayConfig.instance().custom_unit_instance_resources()}"
Let's split by comma, convert the result to a set, and then check membership. Otherwise we may get a false match: for example, if there is a unit resource called Foo_Bar, a substring check will treat Foo as a unit resource too.
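A minimal sketch of the difference the reviewer describes, assuming the two config values are comma-separated strings. The helper `parse_unit_resources` and the sample values are hypothetical; they stand in for what `predefined_unit_instance_resources()` and `custom_unit_instance_resources()` return and are not taken from the Ray source.

```python
from typing import Set


def parse_unit_resources(*config_values: str) -> Set[str]:
    """Split comma-separated resource lists into a set of exact names."""
    names = set()
    for value in config_values:
        for name in value.split(","):
            name = name.strip()
            if name:  # skip empty entries such as an unset custom list
                names.add(name)
    return names


# Hypothetical config values, used only for illustration.
predefined = "GPU,TPU"
custom = "Foo_Bar"

# Substring check against the joined string: "Foo" matches inside "Foo_Bar".
joined = f"{predefined},{custom}"
substring_hit = "Foo" in joined

# Membership check against the parsed set: only exact names match.
unit_resources = parse_unit_resources(predefined, custom)
exact_hit = "Foo" in unit_resources
```

Here `substring_hit` is `True` while `exact_hit` is `False`, which is the false match that the set-based check avoids.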
python/ray/_raylet.pyx
Outdated
    "Unit instance resource (GPU, TPU, Neuron Core) quantities >1 must",
    f" be whole numbers. {key} resource with value {value} is invalid.")
-     "Unit instance resource (GPU, TPU, Neuron Core) quantities >1 must",
-     f" be whole numbers. {key} resource with value {value} is invalid.")
+     f"{key} resource quantities >1 must",
+     f" be whole numbers. The specified quantity {value} is invalid.")
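The rule behind this message can be sketched as a standalone check. This is an illustration under stated assumptions, not the code in `_raylet.pyx`: the function `check_resource_quantity` and the set `UNIT_INSTANCE_RESOURCES` are hypothetical names, and the resource names in the set are assumed from the discussion above. The two message fragments are joined by adjacent string literals so that the exception carries a single string.

```python
# Assumed names for the unit instance resources discussed in this PR.
UNIT_INSTANCE_RESOURCES = {"GPU", "TPU", "neuron_cores"}


def check_resource_quantity(key, value, unit_resources=UNIT_INSTANCE_RESOURCES):
    """Raise ValueError if a unit instance resource above 1 is not whole."""
    is_whole = float(value).is_integer()
    if key in unit_resources and value > 1 and not is_whole:
        raise ValueError(
            f"{key} resource quantities >1 must"
            f" be whole numbers. The specified quantity {value} is invalid."
        )


check_resource_quantity("CPU", 1.5)  # allowed: CPU is not a unit instance resource
check_resource_quantity("GPU", 0.5)  # allowed: fractional values below 1 are fine
check_resource_quantity("GPU", 2)    # allowed: whole number above 1

try:
    check_resource_quantity("GPU", 1.5)
    rejected = False
except ValueError as err:
    rejected = True
    message = str(err)
```

The first call reflects the behavior this PR restores: a non-unit-instance resource may take a fractional value greater than 1.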
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com> Signed-off-by: jonathan-anyscale <144177685+jonathan-anyscale@users.noreply.github.com>
There's a bug: Ray should allow non-unit-instance resources to have fractional values greater than 1 (the unit instance resources are currently GPU, TPU, and Neuron Core). This PR fixes that.

Signed-off-by: Jonathan Nitisastro <jonathannitisastro@jonathancn-QC69NQYVVG.local.meter>
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
Signed-off-by: jonathan-anyscale <144177685+jonathan-anyscale@users.noreply.github.com>
Co-authored-by: Jonathan Nitisastro <jonathannitisastro@jonathancn-QC69NQYVVG.local.meter>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
There's a bug: Ray should allow non-unit-instance resources to have fractional values greater than 1 (the unit instance resources are currently GPU, TPU, and Neuron Core). This PR fixes that.
Related issue number
Closes #37241
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.