Add the option to return only the info_dict #296

Closed
cochiseruhulessin opened this Issue Feb 27, 2012 · 5 comments

4 participants

When _real_extract() is called, it immediately sends the parsed info to its FileDownloader.

To expand the usability of youtube_dl, it would be great if the caller could retrieve only the parsed video info and do whatever they please with it.

A quick look at the code shows that this could be done by adding an argument, for example "data_only", to InfoExtractor.extract().
It would be a bool indicating whether we should download or only parse the info, and it would then be passed on to _real_extract().

See this example for __init__.py, lines 1450-1467. It assumes _real_extract() received the "data_only" argument.

            try:
                # Process video information
                data = {
                    'id':       video_id.decode('utf-8'),
                    'url':      video_real_url.decode('utf-8'),
                    'uploader': video_uploader.decode('utf-8'),
                    'upload_date':  upload_date,
                    'title':    video_title,
                    'stitle':   simple_title,
                    'ext':      video_extension.decode('utf-8'),
                    'format':   (format_param is None and u'NA' or format_param.decode('utf-8')),
                    'thumbnail':    video_thumbnail.decode('utf-8'),
                    'description':  video_description,
                    'player_url':   player_url,
                }
                if data_only:
                    return data
                self._downloader.process_info(data)
            except UnavailableVideoError, err:
                self._downloader.trouble(u'\nERROR: unable to download video')
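
For readers who want to try the idea outside the youtube_dl source tree, here is a minimal, self-contained sketch of how a "data_only" flag could travel from extract() to _real_extract(). The classes below are simplified stand-ins, not the real youtube_dl classes, and DummyIE and RecordingDownloader are hypothetical names made up for illustration.

```python
class InfoExtractor(object):
    """Simplified stand-in for youtube_dl's InfoExtractor (assumption)."""

    def __init__(self, downloader=None):
        self._downloader = downloader

    def extract(self, url, data_only=False):
        # Forward the proposed flag to the extractor-specific step.
        return self._real_extract(url, data_only=data_only)

    def _real_extract(self, url, data_only=False):
        raise NotImplementedError


class DummyIE(InfoExtractor):
    """Hypothetical extractor that builds a tiny info dict from the URL."""

    def _real_extract(self, url, data_only=False):
        data = {
            'id': url.rsplit('/', 1)[-1],
            'url': url,
            'title': u'dummy title',
        }
        if data_only:
            # Caller only wants the parsed info; skip the downloader.
            return data
        self._downloader.process_info(data)


class RecordingDownloader(object):
    """Hypothetical downloader that records what it was asked to process."""

    def __init__(self):
        self.processed = []

    def process_info(self, info):
        self.processed.append(info)


fd = RecordingDownloader()
ie = DummyIE(fd)
info = ie.extract('http://example.com/v/abc123', data_only=True)
```

With data_only=True the dict comes back to the caller and the downloader is never touched; without it, the call behaves as before and returns nothing.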
Collaborator

FiloSottile commented Mar 6, 2012

It might be clearer to separate extracting the info from passing it to the downloader. So, for example, _real_extract() returns the info dict and extract() passes it to the downloader.
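
A rough sketch of that split, assuming simplified stand-in classes rather than the actual youtube_dl code (ExampleIE and StubDownloader are hypothetical names):

```python
class InfoExtractor(object):
    """Simplified stand-in for youtube_dl's InfoExtractor (assumption)."""

    def __init__(self, downloader):
        self._downloader = downloader

    def extract(self, url):
        # Extraction and hand-off are now two separate steps.
        info = self._real_extract(url)
        self._downloader.process_info(info)
        return info

    def _real_extract(self, url):
        # Subclasses only parse and return; they never call the downloader.
        raise NotImplementedError


class ExampleIE(InfoExtractor):
    """Hypothetical extractor used only for this illustration."""

    def _real_extract(self, url):
        return {'id': url.rsplit('/', 1)[-1], 'url': url}


class StubDownloader(object):
    """Hypothetical downloader that just records the dicts it receives."""

    def __init__(self):
        self.processed = []

    def process_info(self, info):
        self.processed.append(info)


fd = StubDownloader()
ie = ExampleIE(fd)
info = ie.extract('http://example.com/v/abc123')
```

A caller that only wants the metadata could then call _real_extract() directly, with no extra flag needed.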

After studying the code a bit, I came to the same conclusion!

Collaborator

FiloSottile commented Mar 30, 2012

Actually, the docstring for extract() suggests that behavior:
"""Extracts URL information and returns it in list of dicts."""
The download should be done by FileDownloader.process_info()

Collaborator

FiloSottile commented Mar 30, 2012

Hmm, the current implementation is insane, particularly the randomly positioned increment_downloads calls. We should definitely put all that stuff in FileDownloader.download().
It also helps the API-fication.
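
Roughly what that could look like, as a sketch under assumptions: the classes are simplified stand-ins, the method bodies are invented for illustration, and ExampleIE is a hypothetical extractor. The extractor only returns a list of info dicts; the counting and processing happen in download().

```python
class ExampleIE(object):
    """Hypothetical extractor: returns a list of info dicts, nothing else."""

    def suitable(self, url):
        return url.startswith('http://example.com/')

    def extract(self, url):
        return [{'id': url.rsplit('/', 1)[-1], 'url': url}]


class FileDownloader(object):
    """Simplified stand-in for youtube_dl's FileDownloader (assumption)."""

    def __init__(self):
        self._ies = []
        self._num_downloads = 0
        self.processed = []

    def add_info_extractor(self, ie):
        self._ies.append(ie)

    def increment_downloads(self):
        self._num_downloads += 1

    def process_info(self, info):
        # The real method would download the file; here we only record it.
        self.processed.append(info)

    def download(self, url_list):
        for url in url_list:
            for ie in self._ies:
                if not ie.suitable(url):
                    continue
                # Bookkeeping and processing live here, not in the IE.
                for info in ie.extract(url):
                    self.increment_downloads()
                    self.process_info(info)
                break  # first suitable extractor wins


fd = FileDownloader()
fd.add_info_extractor(ExampleIE())
fd.download(['http://example.com/v/a', 'http://example.com/v/b', 'ftp://other/c'])
```

In this sketch a URL that no extractor accepts is skipped silently; real code would presumably report an error there.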

FiloSottile added a commit to FiloSottile/youtube-dl that referenced this issue Mar 30, 2012

moved increment_downloads and process_info calls from IEs to FD.download (#296) (follows current doclines); a small step towards importability #217

@ghost ghost assigned jaimeMF Apr 29, 2013

Collaborator

jaimeMF commented May 15, 2013

All the InfoExtractors now return an info_dict.

@jaimeMF jaimeMF closed this May 15, 2013
