Activity

eval dataloader and sampler changes

djsaunde pushed 1 commit to sequence-parallelism • 39bc435…b8bf3ee • 1 hour ago

feat: add example gemma3 config

NanoCode012 pushed 1 commit to feat/gemma3 • 25c3df9…c1fd37e • 3 hours ago

fix: temp remove liger unknown model patch

NanoCode012 pushed 1 commit to feat/gemma3 • b05de06…25c3df9 • 4 hours ago

remove debug logs and simplify

Force push
djsaunde force pushed to sequence-parallelism • 84eb704…39bc435 • 5 hours ago

see if it passes e2e

bursteratom pushed 1 commit to flex_2 • da02bdc…83379c5 • 7 hours ago

chore(docs): add cookbook/blog link to docs

NanoCode012 created doc/cookbook • d86526d • 8 hours ago

fix: add validation for liger for unsupported models

NanoCode012 pushed 1 commit to feat/gemma3 • cbab811…b05de06 • 13 hours ago

fixing sample packing

djsaunde pushed 1 commit to sequence-parallelism • b018956…84eb704 • yesterday

tracking the main

bursteratom pushed 1 commit to flex_2 • aa01b0f…da02bdc • yesterday

working multi-group SP

Force push
djsaunde force pushed to sequence-parallelism • d36bee5…b018956 • yesterday

e2e test debug - transformers version

bursteratom pushed 1 commit to flex_2 • 18adb26…aa01b0f • yesterday

only validate hf user token on rank 0

winglian created check-user-token-single-rank • 6eac0cf • yesterday

e2e test for llama flex attention w/ sample packing

bursteratom pushed 1 commit to flex_2 • fda49d8…18adb26 • yesterday

typo

bursteratom pushed 1 commit to flex_2 • 1acbb5a…fda49d8 • yesterday

requirements: transformers on main

bursteratom pushed 1 commit to flex_2 • dade4d4…1acbb5a • yesterday

add pydantic validation for attention fields

bursteratom pushed 1 commit to flex_2 • 13d26c9…dade4d4 • yesterday

rebased flex attn integration

Force push
bursteratom force pushed to flex_2 • 5310189…13d26c9 • yesterday

remove reference to deprecated import

winglian created peft-fixes-20250312 • 3bec79d • yesterday

fix: handle system role properly depending on template

NanoCode012 pushed 1 commit to feat/gemma3 • 518930f…cbab811 • yesterday

fix: format

NanoCode012 pushed 1 commit to feat/gemma3 • 2e27406…518930f • yesterday

fix: format

NanoCode012 pushed 1 commit to feat/gemma3 • ea8e94f…2e27406 • yesterday

fix: use tagged gemma3 transformers

NanoCode012 pushed 1 commit to feat/gemma3 • 73d0c36…ea8e94f • yesterday

fix: rearrange sort order

NanoCode012 pushed 1 commit to feat/gemma3 • e134954…73d0c36 • yesterday

feat: add gemma2 e2e test

NanoCode012 created feat/gemma3 • e134954 • yesterday

test import-within-import relative path

bursteratom created preprocess_grpo-fix • 83fcaba • yesterday

Built site for gh-pages

Force push
github-actions[bot] force pushed to gh-pages • 6b8523c…e6f6390 • 2 days ago

Built site for gh-pages

github-actions[bot] pushed 1 commit to gh-pages • 6e1ad11…6b8523c • 2 days ago

Deleted branch

use max of 32 dataset processes if not explicit (#2403)

Pull request merge
winglian pushed 1 commit to main • 59899b9…f0072f3 • 2 days ago

Deleted branch

winglian deleted fix-untrained-w-zero3 • 2 days ago