This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Conversation

@dbogunowicz (Contributor) commented Nov 28, 2023

PR #1826 added new functionality so that, when no sequence_length is specified, we default to the maximum sequence length from the config. Unfortunately, as discussed in the comments on #1826, an HF config is not guaranteed to have a max_position_embeddings attribute, and this information can also appear under a different key.
E.g.
For the model TinyLlama-1.1B-Chat-v0.3 we look at config.max_position_embeddings
For the model zoo:mpt-7b-mpt_chat_mpt_pretrain-pruned80_quantized we look at config.max_seq_len

This PR:

  1. Adds max_seq_len to the set of config attributes that may be used to infer the default sequence_length
  2. Raises a ValueError if the default sequence_length cannot be inferred (the user should then specify it manually); see the sketch below
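A minimal sketch of the fallback logic described above, assuming a Hugging Face PretrainedConfig-like object; the helper name infer_sequence_length is hypothetical and not taken from the actual diff:

from typing import Optional


def infer_sequence_length(config, sequence_length: Optional[int] = None) -> int:
    """Return sequence_length, falling back to known config attributes."""
    if sequence_length is not None:
        return sequence_length
    # check max_position_embeddings first (e.g. TinyLlama), then max_seq_len (e.g. MPT)
    for attribute in ("max_position_embeddings", "max_seq_len"):
        if hasattr(config, attribute):
            return getattr(config, attribute)
    raise ValueError(
        "Could not infer a default sequence_length from the model config; "
        "please specify sequence_length explicitly"
    )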

)

if sequence_length is None:
    if hasattr(config, "max_position_embeddings"):


Could a config have both attributes? If so, is max_position_embeddings preferred over max_seq_len, as coded here?

@dbogunowicz (Contributor, Author)


Good catch. Regarding both keys being simultaneously present: honestly, I do not know; I would need to go through more example configs. But preferring max_position_embeddings over max_seq_len sounds reasonable to me.
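For illustration, a quick check of that preference order, using the hypothetical infer_sequence_length helper sketched above and a stand-in config object (not from the actual PR):

from types import SimpleNamespace

# a minimal stand-in for an HF config that defines both attributes
config = SimpleNamespace(max_position_embeddings=2048, max_seq_len=4096)
assert infer_sequence_length(config) == 2048  # max_position_embeddings wins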

@bfineran bfineran merged commit 8f2296f into main Nov 29, 2023
@bfineran bfineran deleted the feature/damian/fix_default_sequence_length branch November 29, 2023 15:17
