shared torch.tensor with multiprocesses using python Queue causes coredump #56480

Open

jackzhou121 opened this issue Apr 20, 2021 · 4 comments

Labels: module: multiprocessing, shadow review, triaged

Comments


jackzhou121 commented Apr 20, 2021

🐛 Bug

Our process crashes when we send a torch.tensor through the Queue, but when the torch.tensor is converted to numpy first, the process works fine.

Even when we send a small torch.tensor, the process crashes again after a few successful puts.

To Reproduce

Steps to reproduce the behavior:

1. Run the following code on Linux with kernel 3.10.0-693.el7.x86_64:

import time
from multiprocessing.managers import SyncManager
from queue import PriorityQueue

import torch


def Manager():
    # Register a shared PriorityQueue on a SyncManager subclass and start it.
    class PipelineManager(SyncManager):
        pass

    PipelineManager.register("PriorityQueue", PriorityQueue)
    m = PipelineManager()
    m.start()
    return m


m = Manager()
pr_queue = m.PriorityQueue()

# ~92 MB float32 tensor
mytensor = torch.ones((153, 3, 224, 224), dtype=torch.float32) * 3.141592

print(mytensor.shape)

# Putting the tensor on the managed queue is where the crash happens.
pr_queue.put({"data": mytensor})

print("put data done")

time.sleep(600)
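
For reference, a minimal sketch of the numpy workaround described above; the consumer-side conversion back with torch.from_numpy is illustrative and not part of the original repro:

# Workaround sketch: convert the tensor to a numpy array before putting it on
# the managed queue; this put does not crash.
pr_queue.put({"data": mytensor.numpy()})

# Consumer side (illustrative): convert back to a tensor after getting the item.
# item = pr_queue.get()
# restored = torch.from_numpy(item["data"])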

Expected behavior

The process calling queue.put should not crash; instead it crashes with a coredump.

Environment

  • Python 3.6
  • Linux kernel 3.10.0-693.el7.x86_64
  • torch 1.2.0


Additional context

cc @ezyang


ezyang commented Apr 21, 2021

Can you try using torch.multiprocessing instead of stock multiprocessing?
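
For illustration, a minimal sketch of that suggestion, assuming the priority ordering is not strictly required; torch.multiprocessing is a drop-in wrapper around the stdlib module, and its Queue moves tensor data through shared memory instead of pickling it (the consumer function below is illustrative):

import torch
import torch.multiprocessing as mp


def consumer(q):
    # The tensor arrives through shared memory rather than as a pickled copy.
    item = q.get()
    print(item["data"].shape, item["data"].mean())


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=consumer, args=(q,))
    p.start()
    q.put({"data": torch.ones((153, 3, 224, 224), dtype=torch.float32) * 3.141592})
    p.join()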

jackzhou121 (Author) commented

torch.multiprocessing has no SyncManager package.


ezyang commented Apr 26, 2021

OK, I filed an issue for this: #56921


ezyang commented Apr 26, 2021

Do you really need a priority queue here?
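
If strict priority ordering is still needed, one possible direction (a sketch under the assumption that priorities are distinct, not something proposed in this thread) is to send (priority, tensor) pairs over a plain torch.multiprocessing.Queue and order them on the consumer side with heapq:

import heapq

import torch
import torch.multiprocessing as mp


def consumer(q, n_items):
    # Collect (priority, tensor) pairs, then process them in priority order.
    # Priorities must be distinct here; otherwise add a counter as a tiebreaker,
    # since comparing tensors element-wise on a tie would raise an error.
    heap = []
    for _ in range(n_items):
        heapq.heappush(heap, q.get())
    while heap:
        priority, data = heapq.heappop(heap)
        print(priority, data.shape)


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=consumer, args=(q, 2))
    p.start()
    q.put((1, torch.ones(3, 224, 224)))
    q.put((0, torch.zeros(3, 224, 224)))
    p.join()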
