Failed to run language_model in GPU other than device 0 #3

@byzhang

The server has multiple TitanX cards. language_model runs on device 0, but fails on any other device.
The error on the NVIDIA side is:

GPU 0000:04:00.0: Detected Critical Xid Error

On the language_model side, the output is:

    Vocabulary size = 10002 (occuring more than 1)
Max training epochs = 2000
    Training cutoff = -1
  Number of threads = 1
     minibatch size = 100
       max_patience = 5
             device = gpu
Load location         = N/A
Constructed Stacked LSTMs
Vocabulary size       = 10002
Input size            = 100
Output size           = 10002
Stack size            = 4
Shortcut connections  = true
Memory feeds gates    = true
an illegal memory access was encountered
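
A possible way to narrow this down (a diagnostic sketch, not a fix): expose only one physical GPU to the process with the CUDA runtime's `CUDA_VISIBLE_DEVICES` environment variable, so that the chosen card appears as device 0 inside the program. If the run then succeeds on a card that otherwise fails, the problem is more likely in how the program selects a non-zero device than in the card itself. The helper names `env_for_gpu` and `run_on_gpu` below are hypothetical, and the command passed in is whatever invocation is already used to start language_model.

```python
import os
import subprocess


def env_for_gpu(physical_index, base_env=None):
    """Return a copy of the environment that exposes a single GPU to CUDA.

    Assumption: the target program uses the standard CUDA runtime, which
    honors CUDA_VISIBLE_DEVICES and CUDA_DEVICE_ORDER.
    """
    if physical_index < 0:
        raise ValueError("GPU index must be non-negative")
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate devices by PCI bus ID so the index matches nvidia-smi ordering.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only this card is visible; inside the process it becomes device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return env


def run_on_gpu(command, physical_index):
    """Run `command` (a list of arguments) with only one GPU visible."""
    completed = subprocess.run(command, env=env_for_gpu(physical_index))
    return completed.returncode
```

For example, `run_on_gpu(["./language_model"], 1)` would start the binary with only the second card visible, assuming the binary sits in the current directory and is started with its usual arguments.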
