Skip to content

Conversation

@thomwolf
Copy link
Member

@thomwolf thomwolf commented Feb 5, 2024

Support for Nanotron models

@NathanHB NathanHB changed the title Fixing quality Fixing quality and greedy untill for brrr Feb 5, 2024
@thomwolf thomwolf changed the title Fixing quality and greedy untill for brrr Support for nanotron Feb 6, 2024
dataset = Dataset.from_list(
[{k: str(v) for k, v in asdict(detail).items()} for detail in task_details]
)
# try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the high level try catch, please add other try catches to prevent the other possible failures

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure we want to silently catch mistake or should we not rather let the run fail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we still want the results to be saved locally. That way we can upload them by hand instead of having to redo the whole eval.



class GenerativeTaskDatasetNanotron(DynamicBatchDataset):
def __getitem__(self, index) -> Request:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need your own class? (Is it only to return the index with the item?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nathan's requirement

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base_model does not use the index for each sample, that means that we need to accommodate the dataset to nanotron.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but I'm unsure why we need to grab the index for brr

config_cls: Type = Config,
model_config_cls: Optional[Type] = None,
model_cls: Optional[Type] = None,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation should be added

clefourrier and others added 5 commits February 7, 2024 15:04
* add management with env config

* use markdown table generator

* doc of the s3 cleaning function

* fix name
@NathanHB NathanHB merged commit 8aaf51c into main Feb 7, 2024
hynky1999 pushed a commit that referenced this pull request May 22, 2025
Support for Nanotron models

---

Co-authored-by: Nathan Habib <nathan.habib@huggingface.co>
Co-authored-by: clementine@huggingface.co <clementine@huggingface.co>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
NathanHB added a commit that referenced this pull request Sep 19, 2025
Support for Nanotron models

---

Co-authored-by: Nathan Habib <nathan.habib@huggingface.co>
Co-authored-by: clementine@huggingface.co <clementine@huggingface.co>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants