
[NEEDS TESTING] Update dependencies#164

Open
FWao wants to merge 7 commits into `main` from `fix/build-5`

Conversation

FWao (Member) commented Mar 23, 2026

This PR updates Python to 3.13.

To avoid building causal-conv1d & mamba-ssm from source, we hard-coded wheel URLs for Linux x86_64 and aarch64, specific to torch 2.10, CUDA 13.0, and Python 3.13.

Unfortunately, no current flash-attn wheels exist, so flash-attn must be built from source, which can take a long time. To speed up installation, the gpu dependency group no longer includes flash-attn; users who need the models that depend on it must install the gpu_all group instead, which builds flash-attn.
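For reference, a hedged sketch of how the pinned wheels and the gpu/gpu_all split might look in `pyproject.toml`. The group names gpu and gpu_all come from this PR; the wheel URLs, filenames, and the exact PEP 735 dependency-groups layout are illustrative assumptions, not the actual pinned artifacts:

```toml
# Hypothetical sketch of the dependency-group split described above.
# Wheel URLs below are placeholders, not the real pinned artifacts.
[dependency-groups]
gpu = [
    # Hard-coded wheels to avoid source builds
    # (torch 2.10 / CUDA 13.0 / Python 3.13), selected per platform
    # via PEP 508 environment markers:
    "causal-conv1d @ https://example.com/causal_conv1d-cp313-linux_x86_64.whl ; platform_machine == 'x86_64'",
    "causal-conv1d @ https://example.com/causal_conv1d-cp313-linux_aarch64.whl ; platform_machine == 'aarch64'",
    "mamba-ssm @ https://example.com/mamba_ssm-cp313-linux_x86_64.whl ; platform_machine == 'x86_64'",
    "mamba-ssm @ https://example.com/mamba_ssm-cp313-linux_aarch64.whl ; platform_machine == 'aarch64'",
]
gpu_all = [
    { include-group = "gpu" },
    # No pre-built wheel available: installed from sdist,
    # so this triggers the slow source build.
    "flash-attn",
]
```

With a PEP 735-aware installer this would let a plain gpu install skip the flash-attn build, while `gpu_all` pulls everything in.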

Some smaller changes were necessary to remain compatible with recent timm and torch versions.

This PR was tested on a DGX Spark!

@FWao FWao requested a review from mducducd March 23, 2026 14:26
mducducd (Collaborator) commented:

I have run tests on real data, and so far no major regressions. The core workflows completed as expected.
With gpu_all, flash-attn installation takes quite a long time, but that seems to be the practical trade-off here.

FWao (Member, Author) commented Mar 26, 2026

> I have run tests on real data, and so far no major regressions. The core workflows completed as expected. With gpu_all, flash-attn installation takes quite a long time, but that seems to be the practical trade-off here.

Thank you!

I think we could set up a repo that hosts pre-built flash-attn wheels for Python 3.13 / torch 2.10 / CUDA 13 on x86_64 and aarch64. Maybe similar to here.
