Implementing HF Padding-Free and GraniteLM Support #257
Conversation
RobotSail left a comment:
A few comments, but it looks good so far. Just need to test it.
This pull request has merge conflicts that must be resolved before it can be merged.
JamesKunstle left a comment:
Generally seems like a solid drop-in replacement with some good updates. This first round of review only covers the code; I'll run everything on AMD hardware as well.
Signed-off-by: Mustafa Eyceoz <meyceoz@redhat.com>
@JamesKunstle added comments for the things you wanted.
@Maxusmusti Much appreciated; testing on AMD now.
Testing passes on AMD; losses between Dolomite and Llama are in parity.
JamesKunstle left a comment:
Tests pass and loss curves look good between Dolomite and Llama padding-free on AMD. I feel good about merging this.
```diff
  labels = sample["labels"]
  input_ids = sample["input_ids"]
- mask_id = get_sp_token(tokenizer, "<MASK>")[0]
+ mask_id = get_sp_token(tokenizer, "<|MASK|>")[0]
```
Will this affect existing models? Or is this purely for training-time?
Yeah, this is only relevant during training.
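To illustrate why the mask token only matters at training time, here is a minimal sketch of how a special-token id like `<|MASK|>` can be resolved and used to mask labels during data processing. The `ToyTokenizer` class and its vocabulary are stand-ins for illustration only, not the actual tokenizer or helper implementation used in this PR.

```python
# Sketch: resolving a special-token id and masking labels at
# data-processing time. ToyTokenizer is a hypothetical stand-in for an
# HF tokenizer; only the masked *labels* change, never the model vocab.

class ToyTokenizer:
    def __init__(self, vocab):
        self.vocab = vocab

    def encode(self, text, add_special_tokens=False):
        # A registered special token string encodes to exactly one id.
        return [self.vocab[text]]

def get_sp_token(tokenizer, sp_string):
    # Mirrors the helper referenced in the diff: encode the special
    # string and expect a single id back.
    ids = tokenizer.encode(sp_string, add_special_tokens=False)
    assert len(ids) == 1, f"{sp_string} should map to exactly one token"
    return ids

tok = ToyTokenizer({"<|MASK|>": 7, "hello": 1, "world": 2})
mask_id = get_sp_token(tok, "<|MASK|>")[0]

# During training, label positions holding mask_id are replaced with -100
# so the loss ignores them; inference-time inputs are unaffected.
labels = [1, 7, 2, 7]
labels = [-100 if t == mask_id else t for t in labels]
print(labels)  # -> [1, -100, 2, -100]
```

Since masking happens entirely in the data pipeline, swapping `<MASK>` for `<|MASK|>` changes nothing about already-trained checkpoints.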
Updating the data collator for models with HF padding-free support, adding support for the upcoming Granite HF model class, and updating the flags/interface accordingly.
-Mustafa
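The core idea behind a padding-free data collator can be sketched as follows: instead of padding every sequence in a batch to the longest one, sequences are concatenated into a single packed row and per-sequence `position_ids` are rebuilt so the attention kernel can recover the boundaries. The function and field names below are illustrative assumptions, not the exact collator interface from this PR.

```python
# Hedged sketch of padding-free collation: concatenate sequences into one
# packed row (no pad tokens) and restart position_ids at 0 for each
# sequence so boundaries remain recoverable downstream.

def padding_free_collate(batch):
    input_ids, labels, position_ids = [], [], []
    for sample in batch:
        seq = sample["input_ids"]
        input_ids.extend(seq)
        labels.extend(sample["labels"])
        # Position ids restart at 0 for each packed sequence.
        position_ids.extend(range(len(seq)))
    return {
        "input_ids": [input_ids],      # single packed "row"
        "labels": [labels],
        "position_ids": [position_ids],
    }

batch = [
    {"input_ids": [5, 6, 7], "labels": [5, 6, 7]},
    {"input_ids": [8, 9], "labels": [8, 9]},
]
out = padding_free_collate(batch)
print(out["position_ids"])  # -> [[0, 1, 2, 0, 1]]
```

Because no pad tokens are inserted, no compute is wasted on padding positions, which is the main efficiency win of the padding-free path over classic padded batching.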