Conflict between --dump-pages and --dump-json #5787
Comments
How will users know which page was downloaded first? For example:

You can't, but since I guess you are not interested in the DASH manifest you can just look for the

What about other sites?

You'll have to analyse case by case; you can't use the same approach for all sites because the downloaded pages vary a lot (they can be HTML, JSON, XML, JavaScript or some custom text format). Before you start doing it, you should really consider whether you need to do it and whether some of the info you need could be directly extracted by youtube-dl.

If the JSON dump provided a simple value which showed whether the signature was encrypted from the start, the problem would be solved. Something as simple as

The reason I want this is that I want to log the number of videos that have an encrypted signature and in some cases also restrict them from being downloaded.

Open a new issue, as I said in #5781 (comment)

I didn't see your reply.
I'm trying to dump both the page source and the JSON in a single command.
Example:
youtube-dl --dump-pages --dump-json URL
However, I am only getting the output of --dump-json, not the output of --dump-pages.
There are currently two solutions that I can think of:
1. Run two commands, one getting the result of --dump-json and the other that of --dump-pages. The major downside of this solution is that every request is made twice, since the sources have to be fetched twice.
2. Use --write-pages, which works perfectly together with --dump-json. The downside is that the pages are saved to a folder, which requires a script that goes through the written files and opens them. What makes things really difficult is determining the order of the pages with --write-pages, since some sites (YouTube, for example) require additional requests (manifests). Because the files all have the same creation time, it's hard to know which file was requested first; working that out would require a regex, making the issue even more complicated.
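
For reference, the --write-pages workaround could be wrapped in a small script along these lines. This is only a minimal sketch, under the assumption that --write-pages writes its files with a .dump extension into the current working directory, so running youtube-dl inside a temporary directory keeps them isolated. The function names collect_dump_files and dump_json_and_pages are hypothetical, and the sketch does not solve the ordering problem: the files are simply returned sorted by name.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def collect_dump_files(directory):
    """Return {filename: bytes} for every *.dump file in `directory`.

    Files are sorted by name only; this says nothing about the order
    in which the pages were actually requested.
    """
    return {
        path.name: path.read_bytes()
        for path in sorted(Path(directory).glob("*.dump"))
    }


def dump_json_and_pages(url, executable="youtube-dl"):
    """Run youtube-dl once and return (info_dicts, pages).

    Assumption: --write-pages saves the intermediary pages as *.dump
    files in the working directory of the process.
    """
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            [executable, "--dump-json", "--write-pages", url],
            cwd=workdir,
            stdout=subprocess.PIPE,
            check=True,
        )
        # --dump-json prints one JSON object per line; skip anything else.
        infos = [
            json.loads(line)
            for line in result.stdout.decode("utf-8").splitlines()
            if line.startswith("{")
        ]
        # Read the pages before the temporary directory is removed.
        pages = collect_dump_files(workdir)
    return infos, pages
```

Calling dump_json_and_pages(URL) would then make a single set of requests and give back both the parsed JSON and the raw page contents, at the cost of still not knowing which page came first.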