Feature Request: List Extractors as JSON #17208

Closed
Kikobeats opened this issue Aug 11, 2018 · 1 comment

@Kikobeats Kikobeats commented Aug 11, 2018

What is the purpose of your issue?

  • Feature request (request for a new functionality)

Hello, this is a small suggestion that I think would make it easier to integrate other software with youtube-dl.

I needed a way to determine which URLs can be extracted using youtube-dl. Something like:

isYouTubeDLSupported('https://google.com') // => false

With that, I can avoid passing URLs to youtube-dl that it doesn't actually support for video extraction.

My problem was how to determine the list of compatible services.

Although you provide youtube-dl --list-extractors, its output is not actually a list of service domains that we can compare against, but it's a start.

I created a tiny JavaScript script to convert it into something more deterministic:

'use strict'

const parseDomain = require('parse-domain')
const jsonFuture = require('json-future')
const youtubedl = require('youtube-dl')
const { promisify } = require('util')

const getExtractors = promisify(youtubedl.getExtractors)

;(async () => {
  // it runs `youtube-dl --list-extractors`
  const extractors = await getExtractors()

  const providers = extractors.reduce((set, extractor) => {
    // take the part before the first ':' and lowercase it, e.g. twitter:card → twitter
    const provider = extractor.split(':')[0].toLowerCase()
    // try to determine whether the result is a domain
    const { domain = '' } = parseDomain(provider) || {}
    // the Set takes care of removing duplicates
    set.add(domain || provider)
    return set
  }, new Set())

  await jsonFuture.saveAsync('providers.json', Array.from(providers))
})()
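
To make the normalization step easier to try in isolation, here is a minimal, dependency-free sketch of it. The helper name `toProviders` and the sample input are made up for illustration (they are not real `--list-extractors` output), and the parse-domain step is deliberately left out:

```javascript
'use strict'

// Hypothetical helper: collapse extractor names into unique provider names.
// Keeps only the part before the first ':' and lowercases it.
const toProviders = extractors =>
  Array.from(new Set(extractors.map(name => name.split(':')[0].toLowerCase())))

// Sample input, invented for illustration only
const sample = ['twitter:card', 'Twitter', 'youtube', 'youtube:playlist', 'Vimeo']

console.log(toProviders(sample))
```

A `Set` keeps insertion order, so each provider appears once, in the order it was first seen.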

Here is the original source code.

I run it every time I install my dependencies, so the script keeps the providers up to date with the latest youtube-dl version. This is an example of the output.

Then it is very easy to check whether a URL is youtube-dl compatible: just extract the domain from the input URL and look it up in the providers list:

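const parseDomain = require('parse-domain')
const providers = require('./providers') // auto-regenerated for each youtube-dl version
// parseDomain returns null for input it cannot parse, so fall back to an empty object
const isSupportedProvider = url => providers.includes((parseDomain(url) || {}).domain)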

For me, it makes sense to include something like this output as part of youtube-dl.

Something similar to --dump-json, which makes it easy to connect youtube-dl with third-party software, but oriented toward listing compatible services.

Maybe --list-extractors-json? I don't know, what do you think? 🙂

Collaborator

@dstftw dstftw commented Aug 11, 2018

I needed a way to determine what URLs can be extracted using youtube-dl.

You can't do that without running the actual extraction process.
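
As a rough illustration of what running the actual extraction process could look like from Node, the sketch below asks youtube-dl itself instead of consulting a domain list. It assumes a `youtube-dl` binary is on the PATH and treats a zero exit status from `--simulate` as "extractable". The helper name `canExtract` and the injectable `run` parameter are hypothetical, added only so the logic can be exercised without youtube-dl installed:

```javascript
'use strict'

const { spawnSync } = require('child_process')

// Hypothetical helper: true when `youtube-dl --simulate <url>` exits with status 0.
// `run` defaults to spawnSync and can be swapped for a stub in tests.
const canExtract = (url, run = spawnSync) => {
  const result = run('youtube-dl', ['--simulate', '--quiet', url], { stdio: 'ignore' })
  // status is null when the binary could not be started, which also counts as "no"
  return result.status === 0
}
```

This performs a real network request per URL, so it is far slower than a lookup in a static list; that cost is the trade-off for getting an answer from the extractors themselves.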

Although you provide youtube-dl --list-extractors, the entries in this list are not actually domains, so I had to convert the list into a domain list.

This is completely senseless. They are deliberately not provided as domains.

Then it is very easy to check if the URL is youtube-dl compatible

This won't work as already described.

@dstftw dstftw closed this Aug 11, 2018
Kikobeats added a commit to microlinkhq/metascraper that referenced this issue Aug 11, 2018