Wav2Vec2ForCTC weight mismatch #30628

Closed
2 of 4 tasks
MahmoudAshraf97 opened this issue May 2, 2024 · 1 comment

Comments


MahmoudAshraf97 commented May 2, 2024

System Info

  • transformers version: 4.40.1
  • Platform: Linux-6.1.58+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.3
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (False)
  • Tensorflow version (GPU?): 2.15.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.8.3 (cpu)
  • Jax version: 0.4.26
  • JaxLib version: 0.4.26
  • Using GPU in script?: False
  • Using distributed or parallel set-up in script?: False

Who can help?

Hi everyone,
@patrickvonplaten, @sanchit-gandhi
Almost all Wav2Vec2 models have a weight mismatch and produce the following message when loaded:

Some weights of the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-arabic were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_g', 'wav2vec2.encoder.pos_conv_embed.conv.weight_v']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-arabic and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

I've tried multiple models from official and unofficial sources, and all of them show the same issue with both the PyTorch and safetensors variants.
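For anyone comparing the two key lists in the message, here is a minimal sketch. It assumes the mismatch is purely a naming difference between PyTorch's legacy `torch.nn.utils.weight_norm` (which registers `weight_g` and `weight_v`) and the newer `torch.nn.utils.parametrizations.weight_norm` (which registers `parametrizations.weight.original0` and `parametrizations.weight.original1`). That cause is an assumption and is not confirmed in this report. The helper `to_parametrized_key` is hypothetical and is not part of transformers or PyTorch.

```python
# Suffix mapping between the legacy weight-norm parameter names and the
# parametrization-based names (magnitude -> original0, direction -> original1).
LEGACY_TO_PARAMETRIZED = {
    "weight_g": "parametrizations.weight.original0",  # magnitude
    "weight_v": "parametrizations.weight.original1",  # direction
}


def to_parametrized_key(key: str) -> str:
    """Hypothetical helper: rename a legacy weight-norm key, leave others untouched."""
    prefix, _, suffix = key.rpartition(".")
    if suffix in LEGACY_TO_PARAMETRIZED:
        new_suffix = LEGACY_TO_PARAMETRIZED[suffix]
        return f"{prefix}.{new_suffix}" if prefix else new_suffix
    return key


# Key lists copied verbatim from the message above.
unused = [
    "wav2vec2.encoder.pos_conv_embed.conv.weight_g",
    "wav2vec2.encoder.pos_conv_embed.conv.weight_v",
]
newly_initialized = [
    "wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0",
    "wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1",
]

# If the renamed "unused" keys equal the "newly initialized" keys, the two
# lists differ only by naming scheme, not by which tensors are involved.
renamed = [to_parametrized_key(k) for k in unused]
keys_match = renamed == newly_initialized
print(keys_match)
```

This only checks the names reported in the message. It does not show whether the checkpoint values actually end up in the loaded model.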

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-arabic"
processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

The problem exists in all versions starting from commit bc9ecef.

Expected behavior

The model should load successfully.

@MahmoudAshraf97
Author

After searching, I found that this is a duplicate of #26796 and #27605.
