file corruption: youtube, youtube-dl, or what? #411

Closed
Albretch opened this issue Aug 30, 2012 · 3 comments

@Albretch

Apparently, if you download the same video twice with youtube-dl, naming one file with the title from the video's description plus the ID and the other with just the ID from the URI, the files' contents differ. Their lengths are the same and they "look" the same, even when you download them one right after the other (YouTube URIs do not contain tildes ("~"), so a tilde separates the title from the ID in the file name):

export _YT_URI=qkSuYoexbiY
echo ${_YT_URI}

date
youtube-dl --no-overwrites --continue --no-progress --audio-format best \
  --max-quality mp4 \
  -o '%(uploader)s'"/"'%(stitle)s'"~"${_YT_URI}.'%(ext)s' \
  "http://www.youtube.com/watch?v=${_YT_URI}&hd=1"

date

youtube-dl --no-overwrites --continue --no-progress --audio-format best \
  --max-quality mp4 \
  -o '%(uploader)s'"/"${_YT_URI}.'%(ext)s' \
  "http://www.youtube.com/watch?v=${_YT_URI}&hd=1"

date

$ ls -lR
.:
total 4
drwxr-xr-x 2 knoppix knoppix 4096 Aug 29 12:03 TheYoungTurks

./TheYoungTurks:
total 260528
-rw-r--r-- 1 knoppix knoppix 133389811 Jun 4 23:31 Bradley_Manning_Trial_Update~qkSuYoexbiY.mp4
-rw-r--r-- 1 knoppix knoppix 133389811 Jun 4 23:31 qkSuYoexbiY.mp4

$ md5sum -b TheYoungTurks/Bradley_Manning_Trial_Update~qkSuYoexbiY.mp4
d7c2064ee789dd79395da15ab2bcd1b2 *TheYoungTurks/Bradley_Manning_Trial_Update~qkSuYoexbiY.mp4

$ md5sum -b TheYoungTurks/qkSuYoexbiY.mp4
d7a5937fda4ad2d8819e97e1c041b29b *TheYoungTurks/qkSuYoexbiY.mp4

Which utility would you use to check media files at the binary level and see whether the difference lies in some metadata or is some other kind of corruption? Would the utility have to "understand" media files?

What do you think is going on here?

thanks
lbrtchx
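
For the byte-level check asked about above, a tool does not necessarily have to understand the media format: locating where two same-sized files differ already tells you whether the change is confined to a small header region. Below is a minimal Python sketch of that idea. The function name `diff_ranges` and the chunk size are made up for illustration, and offsets are 0-based (unlike `cmp -l`, which counts bytes from 1).

```python
def diff_ranges(path_a, path_b, chunk_size=65536):
    """Return (start, end) byte ranges, end exclusive, where two files differ.

    Only the common prefix is compared; check the file lengths separately
    (e.g. with os.path.getsize) if they might not match.
    """
    ranges = []
    run_start = None  # offset where the current run of differing bytes began
    offset = 0        # offset of the first byte of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a[:n] == b[:n]:
                # Whole chunk identical: any open run ended at the chunk start.
                if run_start is not None:
                    ranges.append((run_start, offset))
                    run_start = None
            else:
                for i in range(n):
                    if a[i] != b[i]:
                        if run_start is None:
                            run_start = offset + i
                    elif run_start is not None:
                        ranges.append((run_start, offset + i))
                        run_start = None
            offset += n
            if len(a) != len(b):
                break  # one file ended before the other
    if run_start is not None:
        ranges.append((run_start, offset))
    return ranges
```

Calling `diff_ranges("TheYoungTurks/qkSuYoexbiY.mp4", "TheYoungTurks/Bradley_Manning_Trial_Update~qkSuYoexbiY.mp4")` would list the differing ranges. A handful of short ranges near the start of the file would point to metadata rather than damaged media data, though that interpretation still has to be confirmed by looking at the bytes.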

@rg3
Collaborator

rg3 commented Sep 4, 2012

In my short experience, yes, YouTube may serve you slightly different
files in different requests. This sometimes manifests itself in the file
size. For example, see the following code that had to be added to
correctly skip files that have already been downloaded:

https://github.com/rg3/youtube-dl/blob/master/youtube_dl/FileDownloader.py#L605

Based on that, maybe YouTube gives you different files every time and
sometimes you can see it by file size while other times it's some
metadata changes inside the file without affecting the file size. I
don't know which tool could be used to verify that.
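
The linked FileDownloader.py code is not reproduced here. As a rough illustration of the idea described above (skipping a file as "already downloaded" even when the server reports a slightly different length), a check could look like the sketch below. The function name and the 100-byte tolerance are assumptions for illustration only, not something stated in this thread.

```python
def looks_already_downloaded(local_size, reported_size, tolerance=100):
    """Guess whether a local file is a complete copy of the remote one.

    local_size    -- size in bytes of the file on disk
    reported_size -- length reported by the server, or None if unknown
    tolerance     -- hypothetical slack in bytes (an assumed value)
    """
    if reported_size is None:
        # Without a reported length there is nothing to compare against.
        return False
    return abs(local_size - reported_size) < tolerance
```

A size check like this cannot detect the case reported in this issue, where the two files have identical lengths but different contents.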

@johlang

johlang commented Mar 24, 2013

While scripting a cache around youtube-dl, I got different file contents on every download I tried. My example was http://www.youtube.com/watch?v=-K_P6G0E7oE.flv. The client IP was always the same.

Using cmp -l, I found that the difference is around byte 305. Here are diffs of the output of hexdump -C -n 512 -- "$f":
First:
< 00000120 41 32 36 31 48 48 31 33 36 34 31 36 31 36 38 31 |A261HH1364161681|
< 00000130 36 39 32 34 34 37 00 00 00 00 00 00 04 70 75 72 |692447.......pur|
---
> 00000120 41 32 36 31 48 48 31 33 36 34 31 36 31 32 39 31 |A261HH1364161291|
> 00000130 32 39 35 32 34 39 00 00 00 00 00 00 04 70 75 72 |295249.......pur|

Second:
< 00000120 41 32 36 31 48 48 31 33 36 34 31 35 32 31 35 32 |A261HH1364152152|
< 00000130 30 37 32 34 30 31 00 00 00 00 00 00 04 70 75 72 |072401.......pur|
---
> 00000120 41 32 36 31 48 48 31 33 36 34 31 35 36 36 32 34 |A261HH1364156624|
> 00000130 32 34 35 36 32 37 00 00 00 00 00 00 04 70 75 72 |245627.......pur|

Looks like some kind of tag in the metadata.
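
One possible reading of the differing field, which is a guess and not confirmed anywhere in this thread, is that the 16 digits following "A261HH" are a Unix timestamp in microseconds. The Python sketch below decodes the four values from the hexdumps under that assumption; the helper name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_microsecond_stamp(digits):
    """Interpret a decimal string as microseconds since the Unix epoch (UTC).

    That the field really is such a timestamp is an assumption.
    """
    # Integer arithmetic via timedelta avoids float rounding of the microseconds.
    return EPOCH + timedelta(microseconds=int(digits))


# The four values visible in the hexdumps above.
stamps = [
    "1364161681692447",
    "1364161291295249",
    "1364152152072401",
    "1364156624245627",
]

for s in stamps:
    print(s, "->", decode_microsecond_stamp(s))
# The first line printed is:
# 1364161681692447 -> 2013-03-24 21:48:01.692447+00:00
```

Under this reading, all four values decode to times on March 24, 2013, the date of this comment, which would be consistent with a per-request tag written into the metadata at serving time.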

@ZainRizvi

Seems like an obsolete issue. Shall we close this?

joedborg referenced this issue in joedborg/youtube-dl Nov 17, 2020
[pull] master from ytdl-org:master
@dirkf closed this as not planned (won't fix, can't repro, duplicate, stale) on May 17, 2024