[extractor/vocaroo] Add Vocaroo extractor #6117

Merged: pukkandan merged 9 commits into yt-dlp:master on Feb 17, 2023

Conversation

@qbnu
Contributor

qbnu commented Jan 30, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

(Description from ytdl-org/youtube-dl#29369)

Adds support for Vocaroo, "the premier voice recording service." The site lacks any sort of metadata for its uploads and all come down as MP3s, so much of this extractor is hardcoded, making it quite fast. While Vocaroo does allow you to easily download individual files from their site, youtube-dl support will allow users to extract Vocaroo embeds with the generic extractor and download many Vocaroo files at once.

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Review threads on yt_dlp/extractor/vocaroo.py: 3 resolved (2 outdated)
pukkandan added the site-request label Jan 30, 2023
@pukkandan
Member

> youtube-dl support will allow users to extract Vocaroo embeds with the generic extractor and download many Vocaroo files at once.

The current extractor doesn't seem to support embeds. Please provide an example.

pukkandan added the pending-fixes label Jan 30, 2023
@qbnu
Contributor Author

qbnu commented Jan 30, 2023

I added your suggestions, but I want to mention that IDs don't have underscores, and the filename is in the format Vocaroo {id}.mp3 when you click the Download button on the site.

@qbnu
Contributor Author

qbnu commented Feb 1, 2023

There is an x-bz-upload-timestamp header in the response which has the UNIX timestamp of when each file was uploaded. Should this be used as the value for timestamp, and is there a way to do it without sending an extra request?

qbnu requested a review from pukkandan February 3, 2023 18:56
pukkandan removed the pending-fixes label Feb 3, 2023
bashonly linked an issue (10 tasks) that may be closed by this pull request Feb 4, 2023
@dirkf
Contributor

dirkf commented Feb 4, 2023

You can make a HEAD request in the extractor (or GET if the site doesn't support it) and get the header value out of the response. Any other solution would be overkill.

Actually, sending the request might be a good thing. The extractor currently just guesses that an audio file exists with the given ID, whereas making at least one request will reveal whether the ID is valid (or perhaps that the extractor's guess is no longer valid).

In my test I found that a media1.... URL redirected to media.....
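
The HEAD-request suggestion above could be sketched roughly as follows, using only the Python standard library rather than yt-dlp's own helpers. The function names, the `scale` parameter and the timeout are hypothetical and were not part of the PR. The thread does not state the unit of `x-bz-upload-timestamp`; if the value turns out to be in milliseconds, passing `scale=1000` would convert it to seconds.

```python
import urllib.request
from typing import Mapping, Optional


def build_head_request(url: str) -> urllib.request.Request:
    """Build a HEAD request so that only the headers are transferred."""
    return urllib.request.Request(url, method="HEAD")


def parse_upload_timestamp(headers: Mapping[str, str],
                           scale: float = 1.0) -> Optional[float]:
    """Read x-bz-upload-timestamp from response headers.

    `scale` is an assumption: use 1000 if the header is in milliseconds.
    Returns None when the header is missing or not numeric.
    """
    raw = headers.get("x-bz-upload-timestamp")
    if raw is None:
        return None
    try:
        return float(raw) / scale
    except ValueError:
        return None


def fetch_upload_timestamp(url: str, scale: float = 1.0,
                           timeout: float = 10.0) -> Optional[float]:
    """Send the HEAD request and extract the timestamp.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, which
    also reveals whether the guessed media URL actually exists.
    """
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
        return parse_upload_timestamp(resp.headers, scale)
```

Header lookup on a real response object is case-insensitive. A plain `dict` passed to `parse_upload_timestamp` needs the lowercase key shown here.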

@qbnu
Contributor Author

qbnu commented Feb 4, 2023

I made the logic for the media subdomain match the site's own logic:

function (e) {
    if (e.length) {
        if (11 == e.length)
            return ControlConfig.mediaMp3FileUrl;
        if (12 == e.length && '1' == e[0])
            return ControlConfig.mediaMp3FileUrl1;
        if (10 == e.length)
            return ControlConfig.mediaMp3FileUrl1
    }
    return ControlConfig.mediaMp3FileUrl
}
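
For comparison, a minimal Python sketch of the same length-based selection. The function name is hypothetical, and the strings `media` and `media1` are stand-ins for `ControlConfig.mediaMp3FileUrl` and `ControlConfig.mediaMp3FileUrl1`, whose actual values are not shown in this thread. It is not necessarily the code that was merged.

```python
def media_subdomain(audio_id: str) -> str:
    """Pick the media host label the way the site's JavaScript does.

    Assumption: 'media' corresponds to ControlConfig.mediaMp3FileUrl and
    'media1' to ControlConfig.mediaMp3FileUrl1.
    """
    # 10-character IDs are served from the second host.
    if len(audio_id) == 10:
        return "media1"
    # 12-character IDs use the second host only when they start with '1'.
    if len(audio_id) == 12 and audio_id[0] == "1":
        return "media1"
    # Everything else (including 11-character and empty IDs) falls
    # through to the default host, as in the original function.
    return "media"
```

The order of the checks differs from the JavaScript, but the branches are mutually exclusive, so the result is the same for every input.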

> In my test I found that a media1.... URL redirected to media.....

Can you give an example? I assumed they were mirrors.

Review threads on yt_dlp/extractor/vocaroo.py: 2 resolved (both outdated)
pukkandan added the pending-fixes label Feb 12, 2023
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
pukkandan removed the pending-fixes label Feb 13, 2023
pukkandan merged commit e4a8b17 into yt-dlp:master Feb 17, 2023
@SuperSonicHub1
Contributor

Hey! Looks like some more of my code made it into yt-dlp. I have no issue with my code being used without permission—it is public domain after all—but being mentioned would have been nice; would've been happy to help and would have closed the PR in youtube-dl. That's all; sorry for waking a sleeping thread.

@pukkandan
Member

@SuperSonicHub1 You are credited in the commit message and changelog.

@SuperSonicHub1
Contributor

Thanks, @pukkandan!

aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Authored by: qbnu, SuperSonicHub1
Closes yt-dlp#6152
Labels: site-request (Request to support a new website)
Projects: None yet
Linked issue that may be closed by this pull request: Add support for Vocaroo
4 participants