Does --download-archive work with either --dump-json or --flat-playlist? #23612

Closed
lihuelworks opened this issue Jan 3, 2020 · 4 comments

@lihuelworks lihuelworks commented Jan 3, 2020

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

I'm writing a PowerShell script that outputs a list of the titles and channels of the videos in a given playlist. The neatest way I found was to use --dump-json (or -j), then grab the titles (.fulltitle) and channel names (.uploader) with jq and append them to a file, like this:

./youtube-dl.exe -i -j $url | jq -j '.fulltitle + \" - \" + .uploader + \"\n\"' >> titles.log

which outputs something like this:

Blowing up my house - awesomeprank65
My house is gone, and my kickstarter - awesomeprank65
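For readers who would rather not depend on jq quoting inside PowerShell, here is a minimal Python sketch of the same formatting step. It assumes each line of the `-j` output is one JSON object containing the `fulltitle` and `uploader` fields used in the command above. The function names and the sample record are hypothetical, made up for illustration.

```python
import json


def title_line(info):
    # Mirrors the jq filter above: "<fulltitle> - <uploader>".
    return "{} - {}".format(info["fulltitle"], info["uploader"])


def title_lines(json_lines):
    # Takes an iterable of JSON strings (one video per line), skips blank lines.
    return [title_line(json.loads(line)) for line in json_lines if line.strip()]


# Hypothetical sample standing in for one line of `youtube-dl -j` output.
sample = [
    '{"fulltitle": "Blowing up my house", "uploader": "awesomeprank65"}',
]
print("\n".join(title_lines(sample)))
```

In a real run, the JSON lines would come from the output of youtube-dl rather than from a hard-coded list.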

The thing is, when I try to use --download-archive to skip the videos I've already listed and sent to titles.log, the command gives the same output: it reads all the videos and appends them to the bottom again. Here are more details:

  • I'm using ">>" on purpose; otherwise the file would be overwritten and its content lost. I need this in case a video gets deleted and I no longer have access to its title.
  • The problem is not related to the script above: running youtube-dl -j --download-archive archive.txt $URL or youtube-dl --flat-playlist --download-archive archive.txt $URL, then adding videos to the playlist and running the command again, gives the same result, listing ALL of the videos in the playlist in both cases (in the second run, including the added videos).
  • In both cases, archive.txt was never created, neither in the testing folder nor in youtube-dl's folder.
  • I'm using the latest version (2020.01.01).
  • Edit: Downloading the videos with --download-archive DOES create the archive.txt file; testing ./youtube-dl.exe -i -j $url | jq -j '.fulltitle + \" - \" + .uploader + \"\n\"' >> titles.log afterwards skips the first videos, as initially intended.

So my questions are:

  • Is --download-archive implemented in such a way that ONLY having a file of the video makes it record its ID to archive.txt? (i.e. the output of --download-archive)
  • Is there any way to skip downloaded files if that was the case?
  • Does order matter when using --download-archive and -j or --flat-playlist? Would putting one first make the command work as intended?
@lihuelworks lihuelworks added the question label Jan 3, 2020
Collaborator

@remitamine remitamine commented Jan 3, 2020

Is --download-archive implemented in such a way that ONLY having a file of the video makes it record its ID to archive.txt? (i.e. the output of --download-archive)

As you can read in the option description:
Record the IDs of all downloaded videos in it.

Is there any way to skip downloaded files if that was the case?

You can create the archive file the same way you're creating titles.log.

Does order matter when using --download-archive and -j or --flat-playlist? Would putting one first make the command work as intended?

No, it doesn't matter.

@remitamine remitamine closed this Jan 3, 2020
Author

@lihuelworks lihuelworks commented Jan 3, 2020

Using the previously made titles.log doesn't skip the files; instead it throws an error: ERROR: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.

Changing the encoding to UTF-8 allows youtube-dl to read the file, but doesn't help with skipping videos. It just lists all the videos again, the only difference being that the file now has all its characters spaced out:
Screenshot_1

Edit: Trying to open the file with an editor other than Notepad (e.g. Visual Studio Code or Notepad++) gives an encoding error or displays broken characters.
Edit 2: The error might be related to how PowerShell handles encoding. I'm changing the default encoding to UTF-8 with no BOM to see the results. Changing to UTF-8, with or without a BOM, avoids the UTF-8 error but doesn't make --download-archive work either.

Collaborator

@remitamine remitamine commented Jan 3, 2020

Of course it won't work: the download archive format is different from the titles.log file you're using. What I mean by "the same way" is that you can generate the archive from the info JSON returned by the -j option.
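As a rough illustration of that suggestion, the sketch below builds archive entries from the `-j` JSON. It assumes the archive is a plain text file with one `<lowercased extractor key> <video id>` entry per line (for example `youtube <id>`), and that each JSON object carries `extractor_key` and `id` fields. That matches youtube-dl's behaviour as far as I know, but check it against an archive file written by a real download. The helper names and sample IDs are hypothetical.

```python
import json


def archive_line(info):
    # Assumption: archive entries look like "<lowercased extractor key> <video id>".
    return "{} {}".format(info["extractor_key"].lower(), info["id"])


def split_new(json_lines, archived):
    # Returns (new_infos, new_archive_lines) for videos not yet listed in `archived`.
    seen = set(archived)
    new_infos, new_lines = [], []
    for raw in json_lines:
        if not raw.strip():
            continue
        info = json.loads(raw)
        entry = archive_line(info)
        if entry in seen:
            continue  # already recorded, so skip it
        seen.add(entry)
        new_infos.append(info)
        new_lines.append(entry)
    return new_infos, new_lines


# Hypothetical sample data standing in for `youtube-dl -j` output.
sample = [
    '{"id": "aaaaaaaaaaa", "extractor_key": "Youtube", "fulltitle": "First"}',
    '{"id": "bbbbbbbbbbb", "extractor_key": "Youtube", "fulltitle": "Second"}',
]
infos, lines = split_new(sample, archived=["youtube aaaaaaaaaaa"])
print(lines)
```

Appending the returned lines to the archive file after each listing run would let a later `--download-archive` run skip the videos that were already listed.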

Author

@lihuelworks lihuelworks commented Jan 3, 2020

That's solved. I can create an archive file that makes --download-archive work, but I can't fix the ERROR: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte that appears when youtube-dl tries to read the file PowerShell just output.
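A byte 0xff at position 0 is consistent with a UTF-16 little-endian byte order mark (FF FE), which Windows PowerShell's `>` and `>>` redirection writes by default. That would also explain the "spaced out" characters seen in some editors. The following is a minimal sketch, not a confirmed fix for this setup, of re-encoding such a file as BOM-less UTF-8 in Python. The function name is hypothetical, and the BOM check is a simple heuristic.

```python
import codecs


def to_utf8_no_bom(raw):
    # Re-encode bytes as BOM-less UTF-8, using a leading BOM to guess the source encoding.
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        text = raw.decode("utf-16")     # the utf-16 codec consumes the BOM
    elif raw.startswith(codecs.BOM_UTF8):
        text = raw.decode("utf-8-sig")  # strips the UTF-8 BOM
    else:
        text = raw.decode("utf-8")      # assume the data is already UTF-8
    return text.encode("utf-8")


# Simulated bytes of a file written by a UTF-16 LE redirect (hypothetical entry).
powershell_style = codecs.BOM_UTF16_LE + "youtube aaaaaaaaaaa\r\n".encode("utf-16-le")
print(to_utf8_no_bom(powershell_style))
```

To apply it, read the archive file in binary mode, pass the bytes through the function, and write the result back in binary mode. Alternatively, writing the archive from Python in the first place avoids the PowerShell redirection altogether.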
