WSL Crash for multi-gpu usage #8479

@naruto0426

Version

Microsoft Windows [Version 10.0.19044.1706]

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.10.102.1

Distro Version

Ubuntu 20.04

Other Software

No response

Repro Steps

PyTorch 1.10.0 + CUDA 10.2 (from the WSL 2 kernel)

Expected Behavior

Multi-GPU training should work normally.

Actual Behavior

When training with PyTorch on multiple GPUs, WSL crashes after about 10-20 minutes.
I have tried two cases.
First, training on multiple GPUs with DDP in PyTorch.
Second, running two separate programs, each training a different model on one GPU.
Both crash within 20 minutes.
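
For anyone trying to reproduce this, here is a rough sketch of how the two cases might be launched. The issue does not give the actual commands, so everything here is an assumption: `train.py` is a hypothetical script name, the GPU count of 2 is a guess, and the second case is assumed to pin each program to its own GPU with `CUDA_VISIBLE_DEVICES`. The sketch only builds the commands; it does not start any training.

```python
import os
import sys

# Hypothetical training script name; the issue does not name one.
TRAIN_SCRIPT = "train.py"


def ddp_command(num_gpus):
    """Case 1: one DDP job spanning several GPUs, started via torch.distributed.run."""
    return [
        sys.executable, "-m", "torch.distributed.run",
        f"--nproc_per_node={num_gpus}",
        TRAIN_SCRIPT,
    ]


def single_gpu_jobs(gpu_ids):
    """Case 2: one independent process per GPU.

    Assumption: each process is pinned to its own GPU through the
    CUDA_VISIBLE_DEVICES environment variable.
    """
    jobs = []
    for gpu_id in gpu_ids:
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        jobs.append(([sys.executable, TRAIN_SCRIPT], env))
    return jobs


if __name__ == "__main__":
    # Print the commands instead of running them.
    print("case 1:", " ".join(ddp_command(2)))
    for cmd, env in single_gpu_jobs([0, 1]):
        print("case 2:", "CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(cmd))
```

To actually start the jobs, each command (and, for the second case, its `env`) could be passed to `subprocess.Popen`. Recording the time until the crash for each case would make the two scenarios easier to compare.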

Diagnostic Logs

No response
