Dumping the page source possible? #10234
Comments
Use
Not sure how I missed these, thanks. I went through both just now and they seem to have issues (as you pointed out).
Yes, currently these options are for debugging only and are not designed for fine-grained control. You can modify https://github.com/rg3/youtube-dl/blob/8b40854/youtube_dl/extractor/common.py#L445-L472 to achieve "setting the directory that you wanted to save the pages to" or similar.
Ah, I see, but the downside is that I'd have to modify it and set it up every single time there's an update, which would be an absolute pain. PS
Something hacky :) At least no need to modify sources.

    from __future__ import unicode_literals

    import json
    import re

    import youtube_dl


    class Logger(object):
        def __init__(self):
            # URL announced by the most recent "Dumping request to ..." message, if any
            self.dumping_url = None
            self.dumped_files = []

        def debug(self, msg):
            # The message that follows a "Dumping request to <url>" line is the
            # base64-encoded page itself, so pair it with the remembered URL.
            if self.dumping_url:
                self.dumped_files.append({
                    'url': self.dumping_url,
                    'base64_data': msg,
                })
                self.dumping_url = None
                return
            mobj = re.search(r'Dumping request to (.+)', msg)
            if mobj:
                self.dumping_url = mobj.group(1)

        def warning(self, msg):
            pass

        def error(self, msg):
            pass


    logger = Logger()
    ydl_opts = {
        'dump_intermediate_pages': True,
        'logger': logger,
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.extract_info('https://www.youtube.com/watch?v=cbjMwKLE-RE', download=False)
    print(json.dumps(logger.dumped_files, indent=4, sort_keys=True))
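If the goal is to end up with the pages in a directory of your choosing, the collected entries could be decoded and written out afterwards. This is only a rough sketch, not part of youtube-dl: `save_dumped_pages` and the file naming scheme are made up for illustration, it assumes Python 3, and it assumes each `base64_data` value is the bare base64 string captured by the logger above.

```python
import base64
import hashlib
import os


def save_dumped_pages(dumped_files, out_dir):
    """Decode each base64 page dump and write it into out_dir.

    dumped_files is assumed to be a list of dicts with 'url' and
    'base64_data' keys, shaped like Logger.dumped_files above.
    Returns the list of file paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, entry in enumerate(dumped_files):
        raw = base64.b64decode(entry['base64_data'])
        # Hypothetical naming scheme: request order plus a short hash of the
        # URL, so the file name stays filesystem-safe.
        digest = hashlib.sha1(entry['url'].encode('utf-8')).hexdigest()[:12]
        path = os.path.join(out_dir, '%03d_%s.dump' % (index, digest))
        with open(path, 'wb') as f:
            f.write(raw)
        paths.append(path)
    return paths
```

With the snippet above you would call something like `save_dumped_pages(logger.dumped_files, 'pages')` after `extract_info` returns. The files hold the raw response bytes, so the right extension (HTML, JSON, XML) depends on what each request returned.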
Big thanks, I'll give this a try! :)
I see options that allow me to dump the user agent and the JSON output, but is there a way to actually dump the source of the page(s) that youtube-dl went through?