Merge branch 'disable-128-wide-cutlass-lstm' into 'master'
Disable Cutlass LSTM codepath for 128 wide layers, as this kernel is broken

See merge request machine-learning/dorado!672
GKolling committed Nov 2, 2023
2 parents 7c1c0f0 + 2239986 commit 8fb8a4d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion dorado/nn/CRFModel.cpp
@@ -106,7 +106,7 @@ static LstmMode get_cuda_lstm_mode(int layer_idx, int layer_size) {
     bool is_TX2 = (prop->major == 6 && prop->minor == 2);
     bool is_A100_H100 = ((prop->major == 8 || prop->major == 9) && prop->minor == 0);

-    if (is_A100_H100 && layer_size <= 1024 && (layer_size % 128) == 0) {
+    if (is_A100_H100 && layer_size <= 1024 && layer_size > 128 && (layer_size % 128) == 0) {
         return (layer_idx == 0) ? LstmMode::CUTLASS_TNC_F16 : LstmMode::CUTLASS_TNC_I8;
     } else if (!is_TX2 && (layer_size == 96 || layer_size == 128)) {
         return LstmMode::QUANTISED_NTC;
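
The effect of the added `layer_size > 128` guard can be illustrated with a minimal standalone sketch of the two branches visible in this hunk. It assumes the compute capability is passed in as plain integers rather than read from the CUDA device properties, and the names `select_lstm_mode` and `LstmMode::OTHER` are hypothetical: `OTHER` only stands in for whatever the rest of the real function (not shown in this diff) returns.

```cpp
#include <cassert>

// Mirrors the enum values that appear in the hunk. OTHER is a hypothetical
// placeholder for the remainder of the real function, which the diff elides.
enum class LstmMode { CUTLASS_TNC_F16, CUTLASS_TNC_I8, QUANTISED_NTC, OTHER };

// Hypothetical stand-in for get_cuda_lstm_mode: takes the compute capability
// (major.minor) as arguments instead of querying the device.
constexpr LstmMode select_lstm_mode(int major, int minor, int layer_idx, int layer_size) {
    const bool is_TX2 = (major == 6 && minor == 2);
    const bool is_A100_H100 = ((major == 8 || major == 9) && minor == 0);

    // After this commit, 128-wide layers no longer take the Cutlass path.
    if (is_A100_H100 && layer_size <= 1024 && layer_size > 128 && (layer_size % 128) == 0) {
        return (layer_idx == 0) ? LstmMode::CUTLASS_TNC_F16 : LstmMode::CUTLASS_TNC_I8;
    } else if (!is_TX2 && (layer_size == 96 || layer_size == 128)) {
        return LstmMode::QUANTISED_NTC;
    }
    return LstmMode::OTHER;  // placeholder: the real fallback is not shown in the diff
}

// Small runtime check of the behaviour this commit changes.
inline void check_128_wide_skips_cutlass() {
    // Compute capability 8.0, 128-wide layer: falls through to the quantised path.
    assert(select_lstm_mode(8, 0, 0, 128) == LstmMode::QUANTISED_NTC);
    // Wider multiples of 128 still select the Cutlass modes.
    assert(select_lstm_mode(8, 0, 0, 256) == LstmMode::CUTLASS_TNC_F16);
}
```

Under these assumptions, a 128-wide layer on a compute capability 8.0 or 9.0 device is now picked up by the second branch and gets `QUANTISED_NTC`, while 256 to 1024 wide layers in multiples of 128 keep the Cutlass modes.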