Reduce block size in transcript-former #234

Merged
giogix2 merged 1 commit into main from reduce_block_size_in_transcriptformer on May 21, 2025

giogix2 (Contributor) commented May 21, 2025

Issue

When fine-tuning the transcriptformer models, we sometimes hit a GPU shared memory error. Specifically, the error looks like:
OutOfResources: out of resource: shared memory, Required: 147456, Hardware limit: 101376. Reducing block sizes or num_stages may help.
This happens in particular when using a GPU other than an H100.
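
To make the numbers in the error concrete, here is a small sketch that parses the message and reports how far the request exceeds the limit. The helper name `parse_shared_memory_error` is hypothetical and not part of Triton or transcriptformer; only the message text comes from the report above.

```python
import re

# Error text copied from the report above.
ERROR = (
    "OutOfResources: out of resource: shared memory, "
    "Required: 147456, Hardware limit: 101376. "
    "Reducing block sizes or num_stages may help."
)


def parse_shared_memory_error(message: str) -> tuple[int, int]:
    """Return (required_bytes, limit_bytes) from an OutOfResources message.

    Hypothetical helper for illustration; not a Triton API.
    """
    match = re.search(r"Required: (\d+), Hardware limit: (\d+)", message)
    if match is None:
        raise ValueError("message does not look like a shared-memory error")
    return int(match.group(1)), int(match.group(2))


required, limit = parse_shared_memory_error(ERROR)
print(f"required:  {required / 1024:.0f} KiB")  # 144 KiB
print(f"limit:     {limit / 1024:.0f} KiB")     # 99 KiB
print(f"overshoot: {required / limit:.2f}x")    # 1.45x
```

The kernel asks for 144 KiB of shared memory on a device that offers 99 KiB, so the request has to shrink by roughly a third before the kernel can launch.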

As explained here, one possible solution is to limit the block size in the Triton kernel.

Solution

Reduce the block size in the Triton kernel.
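
As a rough illustration of why a smaller block can resolve the error, the sketch below assumes that shared memory use scales linearly with a single block dimension. That is a simplification, not a description of the actual transcriptformer kernel. The starting block size of 128 and the helper `largest_fitting_block` are hypothetical; only the byte counts come from the error message.

```python
def largest_fitting_block(
    block: int, required: int, limit: int, min_block: int = 16
) -> int:
    """Halve `block` until the estimated shared memory fits within `limit`.

    Assumption (simplification): shared memory is proportional to `block`.
    `required` is the shared memory reported for the starting `block`.
    """
    bytes_per_unit = required / block  # estimated bytes per unit of block size
    while block > min_block and bytes_per_unit * block > limit:
        block //= 2
    if bytes_per_unit * block > limit:
        raise ValueError("no block size >= min_block fits the limit")
    return block


# Byte counts from the error message; the starting block size 128 is assumed.
new_block = largest_fitting_block(block=128, required=147456, limit=101376)
print(new_block)  # 64
```

Under this assumed model, a single halving brings the estimate down to 72 KiB, which fits within the 99 KiB limit. Real kernels may use shared memory for several tiles and pipeline stages at once, so the actual relationship can differ.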

Test

This was tested on an A6000 GPU.

giogix2 marked this pull request as ready for review May 21, 2025 15:28
giogix2 merged commit f198280 into main May 21, 2025
6 checks passed
giogix2 deleted the reduce_block_size_in_transcriptformer branch May 21, 2025 15:29