ValueError: We expect a numpy ndarray as input, got <class 'NoneType'> #89

Closed
PiotrEsse opened this issue May 7, 2024 · 6 comments · Fixed by #92
Labels: bug (Something isn't working)

Comments

@PiotrEsse

Hi,
I am trying the YouTube-to-text example and I am getting the following error.

2024-05-07 07:58:17,383 - WARNING - /home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")

2024-05-07 07:58:19,889 - ERROR - An error occurred: __init__: could not find match for ^\w+\W
2024-05-07 07:58:20,578 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-05-07 07:58:21,842 - INFO - Model loaded successfully.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-05-07 07:58:22,965 - INFO - Transcribing audio...
Traceback (most recent call last):
  File "/home/piotr/WhisperPlus/Tutorial/YoutubeToText.py", line 31, in <module>
    transcript = pipeline(
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/whisperplus/pipelines/whisper.py", line 91, in __call__
    result = pipe(audio_path)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 285, in __call__
    return super().__call__(inputs, **kwargs)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/base.py", line 1234, in __call__
    return next(
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/pt_utils.py", line 269, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
    data.append(next(self.dataset_iter))
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/pt_utils.py", line 186, in __next__
    processed = next(self.subiterator)
  File "/home/piotr/anaconda3/envs/WhisperPlus38/lib/python3.8/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 410, in preprocess
    raise ValueError(f"We expect a numpy ndarray as input, got `{type(inputs)}`")
ValueError: We expect a numpy ndarray as input, got `<class 'NoneType'>`

@kadirnar
Owner

kadirnar commented May 7, 2024

Can you share your code? The pipeline raises this error because the input is not a .mp3 file.

kadirnar self-assigned this May 7, 2024
kadirnar added the bug (Something isn't working) label May 7, 2024
@kadirnar
Owner

kadirnar commented May 7, 2024

The YouTube video can't be downloaded. This may be related to library versions. I will test it.

2024-05-07 07:58:19,889 - ERROR - An error occurred: __init__: could not find match for ^\w+\W
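
A note on how the two log lines connect: the download error at 07:58:19 is logged but the script keeps running, and the pipeline later receives None instead of a file path. Below is a minimal sketch of a guard, assuming the download helper returns None on failure (the traceback is consistent with that, but the helper's code is not shown in this thread). `checked_download` and `fake_downloader` are hypothetical names for illustration, not part of whisperplus.

```python
from typing import Callable, Optional


def checked_download(downloader: Callable[[str], Optional[str]], url: str) -> str:
    """Run a download helper and fail fast if it produced no file path."""
    audio_path = downloader(url)
    if audio_path is None:
        # Stop here with a clear message instead of passing None to the ASR pipeline.
        raise RuntimeError(f"Download failed for {url!r}; no audio file was produced")
    return audio_path


def fake_downloader(url: str) -> Optional[str]:
    """Hypothetical stand-in that mimics a helper which swallows its own error."""
    return None


if __name__ == "__main__":
    try:
        checked_download(fake_downloader, "https://example.com/video")
    except RuntimeError as exc:
        print(exc)
```

With a guard like this, the failure would surface at the download step rather than as a ValueError deep inside the transformers pipeline.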

@kadirnar
Owner

kadirnar commented May 7, 2024

There is a bug in the pytube library. I will fix this error.
pytube/pytube#1201
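
For context on the log message: `^\w+\W` is a regular expression that requires the text to start with one or more word characters followed by one non-word character. The small sketch below uses Python's `re` module to show when such a pattern matches and when it does not. The sample strings and the `leading_token` helper are made up for illustration; the actual text pytube was parsing is not shown in this thread.

```python
import re
from typing import Optional

# The pattern quoted in the error message above.
PATTERN = re.compile(r"^\w+\W")


def leading_token(text: str) -> Optional[str]:
    """Return the leading 'word + one non-word character', or None if absent."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(leading_token("abc=function(a)"))  # starts with word characters, then "="
    print(leading_token("$abc=1"))           # starts with a non-word character
```

When the text being parsed changes shape so that it no longer starts this way, the search finds nothing, which is the kind of situation a "could not find match" error reports.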

@PiotrEsse
Author

PiotrEsse commented May 7, 2024

The code is taken straight from your example, with no changes.

If I hardcode a file with audio_path = "/home/piotr/WhisperPlus/Tutorial/zycie.mp3", then it works.

@kadirnar
Owner

kadirnar commented May 7, 2024

This function was working 1-2 days ago, but now it gives an error. I can't download videos either, and I don't know why. Pytube is not an up-to-date library, so I have started writing the download code with a different library.

For now, you can download the .mp3 file manually and test with it. The bug is only in the download function.
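
The manual workaround can be sketched as a small check on the local file before it is handed to the pipeline. This is only an illustration under stated assumptions: `local_mp3` is a hypothetical helper, not part of whisperplus, and the .mp3 check simply mirrors the file type mentioned in this thread.

```python
from pathlib import Path


def local_mp3(path: str) -> str:
    """Validate a manually downloaded .mp3 file and return its path as a string."""
    candidate = Path(path)
    if candidate.suffix.lower() != ".mp3":
        raise ValueError(f"Expected a .mp3 file, got: {candidate.name}")
    if not candidate.is_file():
        raise FileNotFoundError(f"No such audio file: {candidate}")
    return str(candidate)
```

The returned string can then be assigned to audio_path in the tutorial script, in the same way the hardcoded path was used earlier in the thread.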

kadirnar linked a pull request May 7, 2024 that will close this issue
@kadirnar
Owner

kadirnar commented May 7, 2024

I rewrote this function (download_youtube_to_mp3). I tested it and it works.
