TensorBoard with profiler installed spawns hundreds of python tasks and crashes machine #644

@NRHuntoon

Description

I'm attempting to run the profiler, and on multiple machines now, when I run TensorBoard after installing the profiler from PyPI, the machines crash due to an excessive number of python tasks spawning. If I run TensorBoard before installing the profiler it runs fine, so I'm pretty sure this is a problem with my installation of the profiler.

Machine 1: Docker
Base Docker image: nvidia/cuda:11.6.2-devel-ubuntu20.04 (CUDA 11.6.2, Ubuntu 20.04)
Python 3.8
PyTorch 1.12

This Docker image does training and inference with torch and CUDA fine so I'm confident in the image and the underlying system.

Machine 2: Desktop PC
Arch Linux
Anaconda Environment created just to test this
CUDA 11.7
Python 3.10
PyTorch 1.12

On both of these machines I can run TensorBoard fine, but once I run pip install torch_tb_profiler, the next time I load TensorBoard it crashes the system. Running 'top' in another terminal shows a flood of "python" tasks that consume all available CPU.
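
For anyone trying to reproduce this, a rough way to put a number on the flood instead of eyeballing 'top' is to count python processes before and after starting TensorBoard. The snippet below is only a sketch: it assumes Linux with /proc mounted, and the helper name count_processes is made up for illustration and is not part of TensorBoard or torch_tb_profiler.

```python
import os


def count_processes(name_prefix="python", proc_root="/proc"):
    """Count processes whose command name starts with name_prefix.

    Hypothetical helper; relies on the Linux /proc layout (/proc/<pid>/comm).
    """
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directory names are PIDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:  # process exited (or is unreadable) in the meantime
            continue
        if comm.startswith(name_prefix):
            count += 1
    return count


if __name__ == "__main__":
    print(count_processes())
```

Running it once before launching tensorboard and again a few seconds after should show whether the python process count really jumps into the hundreds.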

If I press Ctrl-C on the tensorboard command, the machine will eventually kill all these processes and recover, but it's unusable.

I can't find any other instances of this on the web.


Labels: bug (Something isn't working), plugin (PyTorch Profiler TensorBoard Plugin related)
