Activity
Bump ring from 0.17.11 to 0.17.13
Optimize non-mla with cat
Fix launch of blockwise fp8 dequant
Just save the progress
Merge branch 'master' into deepseekv3_fixes
Some fixes, bump synchronize limit
Remove gpu<>cpu sync for faster long-context