ValueError: After taking into account object store and redis memory usage, the amount of memory on this node available for tasks and actors (-0.01 GB) is less than 0% of total. You can adjust these settings with ray.init(memory=<bytes>, object store memory=<bytes> #11966

LianShuaiLong · 2020-11-12T09:49:04Z

What is the problem?

When I run Ray on an ML platform, the following error occurs:

ValueError: After taking into account object store and redis memory usage, the amount of memory on this node available for tasks and actors (-0.01 GB) is less than 0% of total. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>)

Can you tell me approximately what values I should set for memory and object_store_memory?
Thanks.
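To make the negative number in the message concrete, here is a minimal sketch of the accounting it describes. The plain subtraction (total minus object store minus Redis) is inferred from the wording of the error, not taken from Ray's source, and `memory_for_tasks` and all byte values are hypothetical.

```python
GIB = 1024 ** 3


def memory_for_tasks(total_bytes, object_store_bytes, redis_bytes):
    """Memory left for tasks and actors after the two reservations.

    Assumption: a plain subtraction, inferred from the error text.
    """
    return total_bytes - object_store_bytes - redis_bytes


# Hypothetical numbers: the two reservations slightly exceed the node total.
total = 1 * GIB
left = memory_for_tasks(
    total,
    object_store_bytes=int(0.6 * GIB),
    redis_bytes=int(0.41 * GIB),
)
print(f"{left / GIB:.2f} GB available for tasks and actors")
```

A negative result like this suggests that the reservations were sized against more memory than the node (or container) actually provides.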

Ray version and other system information (Python version, TensorFlow version, OS):

Reproduction (REQUIRED)

Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):

If we cannot run your script, we cannot fix your issue.

I have verified my script runs in a clean environment and reproduces the issue.
I have verified the issue also occurs with the latest wheels.

rkooo567 · 2020-11-12T18:17:32Z

This is a great question! Usually these memory settings are chosen automatically by Ray. Did you set custom values when you ran ray.init?

rkooo567 · 2020-11-12T18:19:15Z

Also cc @richardliaw

LianShuaiLong · 2020-11-13T01:31:47Z

I only set num_cpus=16.

rkooo567 · 2020-11-13T05:05:47Z

@ericl Do you know what's the recommended setup?

Also, what's your total memory @LianShuaiLong?

LianShuaiLong · 2020-11-13T05:26:31Z

I run it on a Machine Learning Platform. I set 16 CPUs, 8 GPUs, and 48 GB of memory for all my training trials.

LianShuaiLong · 2020-11-13T05:46:20Z

I found that Ray starts successfully when I pass no parameters to ray.init(). Why?

rkooo567 · 2020-11-14T03:31:04Z

Hmm, so are you saying

ray.init() # works
ray.init(num_cpus) # doesn't work

?

LianShuaiLong · 2020-11-15T02:53:32Z

Yep, it failed when I set num_cpus, _temp_dir, or _memory.
By the way, I run my experiment on a Machine Learning Platform; it works well on my own computer.

LianShuaiLong · 2020-12-02T03:35:29Z

Since I got this error again, I am reopening this issue.

rahulmadanraju · 2021-11-11T08:49:09Z

Is there a way to get past this error? I have upgraded Ray to 1.8 as well, but the issue still shows up.

The error remains with both ray.init() and ray.init(num_cpus).

rkooo567 · 2021-11-11T10:36:07Z

What if you pass num_cpus=4 or something similar? (Provide it as a keyword argument instead of a positional argument.)
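A small sketch of why the keyword form matters. `init_like` is a hypothetical stand-in, not Ray's function; it assumes a signature shaped like ray.init's, where (as far as I know) the first positional parameter is the cluster address. Check the signature of your installed version before relying on this.

```python
def init_like(address=None, *, num_cpus=None):
    """Hypothetical stand-in that only reports how its arguments were bound."""
    return {"address": address, "num_cpus": num_cpus}


# Keyword form: the value is bound to num_cpus.
print(init_like(num_cpus=4))

# Positional form: the value is bound to address, and num_cpus stays unset.
print(init_like(4))
```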

rahulmadanraju · 2021-11-11T10:49:59Z

It's still the same; I initially had the same setup.
Ray is initialized in JupyterLab, and the Jupyter kernel often restarts the moment execution reaches function.remote().

Debugging showed the error mentioned above:
ValueError: After taking into account object store and redis memory usage, the amount of memory on this node available for tasks and actors (-0.01 GB) is less than 0% of total. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>)

I did an initial check of the node and resources. They are as follows:
ray.available_resources()

{'node:172.18.0.24': 1.0,
 'CPU': 4.0,
 'memory': 3957492942.0,
 'object_store_memory': 1978746470.0}

ray.nodes()

[{'NodeID': 'c17c3cfcfee9490830d2e8e1b49ed0c3f021bb7b6a6ac9fd7c0d5aa9',
  'Alive': True,
  'NodeManagerAddress': '172.18.0.24',
  'NodeManagerHostname': '6f7ac0772b5f',
  'NodeManagerPort': 45627,
  'ObjectManagerPort': 36912,
  'ObjectStoreSocketName': '/tmp/ray/session_2021-11-11_10-52-33_830664_27560/sockets/plasma_store',
  'RayletSocketName': '/tmp/ray/session_2021-11-11_10-52-33_830664_27560/sockets/raylet',
  'MetricsExportPort': 58984,
  'alive': True,
  'Resources': {'memory': 3957492942.0,
   'node:172.18.0.24': 1.0,
   'object_store_memory': 1978746470.0,
   'CPU': 4.0}}]
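For readability, the byte counts reported above can be converted to GiB. This is only a unit conversion of the values already shown; the variable names are illustrative.

```python
# Values copied from the ray.available_resources() output above.
resources = {
    "CPU": 4.0,
    "memory": 3957492942.0,
    "object_store_memory": 1978746470.0,
}

GIB = 1024 ** 3
memory_gib = resources["memory"] / GIB
object_store_gib = resources["object_store_memory"] / GIB

print(f"memory: {memory_gib:.2f} GiB, object store: {object_store_gib:.2f} GiB")
```

That works out to roughly 3.7 GiB for tasks and actors and 1.8 GiB for the object store on this node.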

rkooo567 · 2021-11-11T10:51:34Z

Hmm, can you also tell me the memory size of the machine/container you run Jupyter on?
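One way to check the container's limit is to read the cgroup memory files directly. This is a minimal sketch assuming a Linux container with the standard cgroup mount points; `parse_limit` and `container_memory_limit` are hypothetical helper names. Note that cgroup v1 reports a very large number, rather than "max", when no limit is set.

```python
from pathlib import Path

# Standard cgroup locations for the memory limit (v2 first, then v1).
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)


def parse_limit(text):
    """Return the limit in bytes, or None when cgroup v2 reports 'max' (no limit)."""
    value = text.strip()
    return None if value == "max" else int(value)


def container_memory_limit():
    """Return the limit from the first readable file.

    Returns None if no limit is set or no cgroup file could be read.
    """
    for path in CGROUP_LIMIT_FILES:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None


print(container_memory_limit())
```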

rahulmadanraju · 2021-11-11T11:16:10Z

orcahmlee · 2022-04-08T04:19:43Z

Quoting @rkooo567: "Usually these memory should be set automatically by ray."

Hi @rkooo567,
Could you provide more details about how Ray sets the memory automatically?
Is there any documentation that describes it, or any other reference I can check?

scottsun94 · 2022-10-17T22:15:27Z

@rkooo567 Bump this.
cc: @jjyao on the documentation feedback.

chongxiaoc · 2022-10-20T23:22:48Z

Is there any documentation about correctly initializing Ray inside a Docker container? I'm hitting the same issue here.

scottsun94 · 2022-10-21T00:14:13Z

cc: @DmitriGekhtman

chongxiaoc · 2022-10-21T03:47:21Z

After upgrading to Ray 2.0, the issue is gone on my side.

oscartackstrom · 2022-11-10T13:56:04Z

I have the same issue with Ray 2.0.0 when calling ray.init() in a Jupyter notebook from VS Code. Sometimes I instead get an error like: Attempting to cap object store memory usage at 58042368 bytes, but the minimum allowed is 78643200 bytes.

rkooo567 · 2022-11-17T05:30:01Z

When you don't specify the object store memory, Ray uses 20% of available memory. I think your machine doesn't have enough available memory (20% of it is less than 80 MB).

You can specify object_store_memory manually to avoid this:

ray.init(object_store_memory=<bytes>)

The minimum you should specify is 78643200.

A better solution is to use an instance that has more available memory.
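A minimal sketch of picking a value that respects the minimum. `pick_object_store_memory` is a hypothetical helper; the 20% default is the figure stated in this thread and may differ between Ray versions, and the minimum is the one quoted in the error message above.

```python
# Minimum quoted in the error message above (75 MiB).
MIN_OBJECT_STORE_BYTES = 78643200


def pick_object_store_memory(available_bytes, fraction=0.2):
    """Return `fraction` of available memory, raised to the minimum if needed.

    Assumption: `fraction` defaults to the 20% mentioned in this thread.
    """
    return max(int(available_bytes * fraction), MIN_OBJECT_STORE_BYTES)


# Hypothetical node with about 290 MB available: 20% of it is roughly the
# 58042368 bytes from the error above, which is below the minimum.
value = pick_object_store_memory(290_211_840)
print(value)

# The result would then be passed as ray.init(object_store_memory=value).
```

Raising the object store to the minimum on such a small node leaves less memory for tasks and actors, which is why a larger instance is the more robust fix.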

LianShuaiLong added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 12, 2020

rkooo567 added the fix-error-msg This issue has a bad error message that should be improved. label Nov 13, 2020

rkooo567 added this to the Better Error Messages milestone Nov 13, 2020

LianShuaiLong closed this as completed Nov 26, 2020

LianShuaiLong reopened this Dec 2, 2020

bveeramani added docs An issue or change related to documentation and removed fix-docs labels May 24, 2022

p4perf4ce mentioned this issue Sep 5, 2022 in "memory size incorrectly estimated when running inside docker #3478" (Closed)

jjyao added the core Issues that should be addressed in Ray Core label Oct 25, 2022

shchur mentioned this issue Jan 20, 2024 in "Fix CI failures facebookresearch/hydra#2842" (Merged)

jjyao added the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Feb 14, 2024

fishbone added core-ux P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Mar 22, 2024
