An error was encountered while loading "pyannote/speaker-diarization" #1128

Closed
Zpadger opened this issue Oct 29, 2022 · 19 comments

Comments

@Zpadger

Zpadger commented Oct 29, 2022

Hello, when I run this code:

from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="my_token")

I get an error:

Traceback (most recent call last):
  File "/home/dg/anaconda3/envs/pyannote/lib/python3.8/site-packages/huggingface_hub/utils/_errors.py", line 213, in hf_raise_for_status
    response.raise_for_status()
  File "/home/dg/anaconda3/envs/pyannote/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/pyannote/segmentation/resolve/2022.07/pytorch_model.bin

The error occurs whether I use a token with the read role or the write role.
Does anyone know how to fix it? Thanks.

@micahjon
Contributor

Thanks for posting! I'm running into a similar error when using a read token:

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="hf_E...rest of token here")
401 Client Error: Repository Not Found for url: https://huggingface.co/pyannote/segmentation/resolve/2022.07/pytorch_model.bin. 
If the repo is private, make sure you are authenticated

I'm able to access the segmentation repository just fine in my browser and agreed to the license, but for some reason the token isn't working.

@pooya-mohammadi

Same error here.

@mezaros

mezaros commented Oct 29, 2022

Same error here, whatever is going on with the tokens is broken and the project has been unusable since 10/27.

@pooya-mohammadi

pooya-mohammadi commented Oct 29, 2022

@micahjon
@Zpadger
@mezaros
use_auth_token works fine. You simply need to update pyannote.audio, because in the new version get_model takes use_auth_token as an input. This was added in the new release, so updating to the latest version should solve the problem!

@JoFrhwld

JoFrhwld commented Oct 29, 2022

It looks like you need to fill out the user agreement on both hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation in order to use either one.
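
A minimal sketch of how one might check that a token can reach both gated repositories before loading the pipeline. It assumes a recent `huggingface_hub` in which `hf_hub_download` accepts `token=` (older releases used `use_auth_token=`), and that `config.yaml` exists in `pyannote/speaker-diarization`. `check_gated_access`, `_hub_fetch`, and `REQUIRED_FILES` are hypothetical names, not part of either library.

```python
from typing import Callable, Dict, Optional

# Repo -> one file to probe. pytorch_model.bin comes from the URL in the
# traceback above; config.yaml is an assumption about the pipeline repo.
REQUIRED_FILES = {
    "pyannote/speaker-diarization": "config.yaml",
    "pyannote/segmentation": "pytorch_model.bin",
}


def _hub_fetch(repo_id: str, filename: str, token: str) -> None:
    """Default probe: try to download one file (it is cached on success)."""
    from huggingface_hub import hf_hub_download  # imported lazily

    hf_hub_download(repo_id=repo_id, filename=filename, token=token)


def check_gated_access(
    token: str,
    fetch: Callable[[str, str, str], None] = _hub_fetch,
) -> Dict[str, Optional[str]]:
    """Return {repo_id: None if reachable, else a short error description}."""
    results: Dict[str, Optional[str]] = {}
    for repo_id, filename in REQUIRED_FILES.items():
        try:
            fetch(repo_id, filename, token)
            results[repo_id] = None
        except Exception as err:  # any failure (e.g. a 401/403 from the hub) is recorded
            results[repo_id] = f"{type(err).__name__}: {err}"
    return results
```

Calling `check_gated_access("hf_...")` with a real token would report, per repository, whether the probe succeeded. A repository that maps to an error string is a likely candidate for a user agreement that has not been accepted yet.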

If I read the rationale for gating the model on HF correctly, this is strictly a data gathering exercise. It introduces too high a friction for me to recommend the system, or to utilize it as a dependency in any way.

@raulqf

raulqf commented Oct 30, 2022

It seems to be working. I've installed a new environment and reproduced the diarization example.

@subtyping

Installed new environment and was able to get things working. I was still encountering the error after accepting both agreements listed by @JoFrhwld - so not sure if agreeing to both of those is needed or not. Either way, fresh install works!

@MrEdwards007

It works now.

I had uninstalled and reinstalled a few times but that did not by itself resolve the issue.
When this was previously working a few days ago before the update, I had already accepted the diarization agreement.
What did resolve the issue was also accepting the segmentation agreement, as @JoFrhwld suggested.

@mezaros

mezaros commented Oct 31, 2022

I'm running the correct 2.1 version, I did go through both diarization and segmentation gateways even though it wasn't specified, I have updated my token a hundred times — and, with nothing changing in my environment, the error has migrated from 403 forbidden documented above to a new SemVer error. Sounds like I need to completely wipe my environment and start over, even though I was already on the correct versions.

Bugs I understand. But why this friction in the first place? It feels developer and user hostile.

@Frascth

Frascth commented Oct 31, 2022

Same problem. In my case, what solved it was: accepting the agreements for all the pyannote.audio models on Hugging Face, using a read token, running huggingface-cli login with the created token, and finally upgrading the pyannote.audio library to 2.1.1 (pip install --upgrade pyannote.audio).
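
A small sketch for confirming that the installed pyannote.audio is at least the 2.1.1 release mentioned above. It relies only on the standard library's `importlib.metadata`; the helper names (`parse_version`, `installed_version`, `is_recent_enough`) are hypothetical, and the version parsing is deliberately simplistic (it ignores pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

MIN_VERSION = (2, 1, 1)  # the release reported to work in this thread


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.1.1' into (2, 1, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # e.g. '1rc1': keep the 1, ignore the rest
            break
    return tuple(parts)


def installed_version(package: str = "pyannote.audio") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_recent_enough(version_text: Optional[str]) -> bool:
    """True when a version string is present and not older than MIN_VERSION."""
    return version_text is not None and parse_version(version_text) >= MIN_VERSION
```

For example, `is_recent_enough(installed_version())` could be checked at startup before calling `Pipeline.from_pretrained`.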

@cetiny

cetiny commented Oct 31, 2022

Agreeing to all the models on Hugging Face resolved the issue for me (thanks @JoFrhwld). Any tips on how to save the model locally and load it from the cache? I don't want my code to break like this again.

@MrEdwards007

Yes, I thought the model was downloaded and really want this to work offline.

@shashankmc

Updating to the latest version of the library and generating use_auth_token for both speaker-diarization and segmentation seems to do the trick. I tested whether using a particular auth token would cause an issue, but it doesn't matter which access token is provided once both are generated.

@mezaros

mezaros commented Nov 1, 2022

Got it working with a full environment reset.

But, no longer enthused about testing this. We could never, ever rely on it or ask anyone else to, after this experience. The models need to work offline.

@pranjal-zipteams

It works now.

I had uninstalled and reinstalled a few times but that did not by itself resolve the issue. When this was previously working a few days ago before the update, I had already accepted the diarization agreement. What did resolve the issue was also accepting the segmentation agreement, as @JoFrhwld suggested.

This fixes the issue for me.

@plandrobe

https://huggingface.co/pyannote/segmentation/resolve/2022.07/pytorch_model.bin.

Same here. I have already accepted the terms in both repos, generated a new token, and updated huggingface_hub and pyannote.audio to the latest versions. The API response is generic for whenever the URL or model is not found (regardless of whether the token is the problem). Look at the URL of the pretrained model in the error: it ends with "bin." (a dot at the end). If you try the URL without the dot in a browser, the file downloads fine. It looks like a bug: at some point the code is adding a "." (dot) to the file request.
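
To test the URL from an error message by hand, a sketch like the following pulls it out and splits it into repository, revision, and file name. One caveat: the trailing dot may simply be the period that ends the sentence in the error message rather than part of the requested URL. That is an assumption, not something confirmed in this thread. `extract_url` and `repo_and_file` are hypothetical helper names.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlparse


def extract_url(message: str) -> Optional[str]:
    """Pull the first URL out of an error message, dropping trailing periods."""
    match = re.search(r"https?://\S+", message)
    if match is None:
        return None
    return match.group(0).rstrip(".")


def repo_and_file(url: str) -> Tuple[str, str, str]:
    """Split a hub 'resolve' URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <owner>/<name>/resolve/<revision>/<filename...>
    if len(parts) < 5 or parts[2] != "resolve":
        raise ValueError(f"not a resolve URL: {url}")
    return "/".join(parts[:2]), parts[3], "/".join(parts[4:])
```

Applied to the message quoted earlier in the thread, this shows that the failing request targets `pyannote/segmentation`, not the `pyannote/speaker-diarization` repository named in the `from_pretrained` call.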

@makkasu

makkasu commented Nov 24, 2022

Some variant on the issue here: I've signed up on the hub to the terms for both models, generated the auth token and I can use it locally on my laptop. However, when I try to run it in an Azure VM, it just hangs on pipeline = Pipeline.from_pretrained('pyannote/speaker-diarization', use_auth_token=... until the max HTTPS retries is exceeded. Quite frustrating, can't really proceed with development using this tool until this hurdle is removed. Note that I can load other huggingface models just fine, but they don't use the auth token business.

hbredin added a commit that referenced this issue Nov 29, 2022
* related bugs: #1119 #1128 #1130
* related discussions: #1123 #1103 #1126 #1121
@hbredin
Member

hbredin commented Nov 29, 2022

I have just updated the FAQ with instructions on how to use pretrained models and pipelines offline.
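
The FAQ text is not quoted in this thread, so the following is only a sketch of one way offline loading might be set up, under stated assumptions: that the pipeline's `config.yaml` stores the segmentation model under `pipeline` → `params` → `segmentation`, and that `Pipeline.from_pretrained` accepts a path to a local config file. `localize_segmentation` is a hypothetical helper, and other models referenced by the config (such as the embedding model) may still need network access.

```python
import copy
from typing import Any, Dict


def localize_segmentation(config: Dict[str, Any], local_model_path: str) -> Dict[str, Any]:
    """Return a copy of a pipeline config whose segmentation model is a local file."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    params = updated.setdefault("pipeline", {}).setdefault("params", {})
    params["segmentation"] = local_model_path
    return updated


# Hypothetical usage (needs PyYAML and pyannote.audio; paths are placeholders):
#   import yaml
#   from pyannote.audio import Pipeline
#   with open("config.yaml") as f:
#       config = yaml.safe_load(f)
#   with open("offline_config.yaml", "w") as f:
#       yaml.safe_dump(localize_segmentation(config, "/path/to/pytorch_model.bin"), f)
#   pipeline = Pipeline.from_pretrained("offline_config.yaml")
```

Both `config.yaml` and `pytorch_model.bin` would have to be downloaded once, with a token that has access, before this can work without a connection.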

@hbredin hbredin closed this as completed Nov 29, 2022
@pri1712

pri1712 commented Jul 9, 2024

Upgrading my pyannote.audio version worked for me as well.
