Address changed for smotrim.ru #31647

Open
5 of 6 tasks
betterthanever2 opened this issue Feb 22, 2023 · 19 comments

@betterthanever2

betterthanever2 commented Feb 22, 2023

Checklist

  • I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2021.12.17
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Description

Right now, trying to download a video from an address like https://smotrim.ru/video/2568723 results in a timeout error due to failure to resolve the player address.
This is likely because resolved video address used to have this format: http://player.rutv.ru/iframe/datavideo/id/2568723, but seems to have been changed recently to https://player.smotrim.ru/iframe/video/id/2568723.

Correction: the issue is actually with the JSON data file URL, which is now https://player.smotrim.ru/iframe/datavideo/id/2568924/sid/smotrim, i.e. features smotrim.ru instead of rutv.ru, and has the /sid/smotrim part at the end.

@dirkf
Contributor

dirkf commented Feb 22, 2023

There is no specific support for smotrim.ru in either the released yt-dl or the latest git master. Some time ago we discovered that the rutube.ru extractor would also handle the site.

Maybe you are using yt-dlp, which does have an extractor that now times out? See #30839.

Background

@betterthanever2
Author

smotrim.ru is handled by the RUTV extractor. A few months back I submitted an issue here about it not being supported explicitly, and eventually a patch was suggested, and it still works for me.
I changed the extractor script to handle the above issue as well, so downloads work fine for me right now. I just thought I'd let you know about this change.

@Vangelis66

Vangelis66 commented Feb 22, 2023

I changed the extractor script to handle the above issue as well

... Perhaps, then, you'd be kind and willing 😉 to share this newest patch of yours here or, even kinder, create a PR with the necessary changes to rutv.py that restore smotrim.ru support in youtube-dl?

Thanks 😃

@dirkf
Contributor

dirkf commented Feb 23, 2023

Sure, if you're using a patched version, that's understandable, but you could have noted that against "verified that I'm running youtube-dl version 2021.12.17".

Perhaps submit your additional patch so that we can roll it in in due course, or submit it to yt-dlp?

@Vangelis66

Vangelis66 commented Feb 23, 2023

OP is located inside Ukraine...
Living in the EU, I find I'm unable to access:
https://smotrim.ru/video/2568723
[Unable to connect error in my browser]
so I assume this applies...

However,
http://player.rutv.ru/iframe/datavideo/id/2568723
DOES load OK here (the opposite to OP), while
https://player.smotrim.ru/iframe/video/id/2568723
and
https://player.smotrim.ru/iframe/datavideo/id/2568924/sid/smotrim
DO NOT (to be expected, due to the EU-wide block of smotrim.ru)

OTOH, When I VPN to Russia,
https://smotrim.ru/video/2568723
DOES load now, as well as
https://player.smotrim.ru/iframe/video/id/2568723
and
https://player.smotrim.ru/iframe/datavideo/id/2568924/sid/smotrim
but
http://player.rutv.ru/iframe/datavideo/id/2568723
DOES NOT (it times out, consistent with OP's report) ...

At the current state of world affairs, it seems any eventual support for smotrim.ru (via rutv.py) would be IP-dependent... 😞

@betterthanever2
Author

I changed the extractor script to handle the above issue as well

... Perhaps, then, you'd be kind and willing 😉 to share this newest patch of yours here or, even kinder, create a PR with the necessary changes to rutv.py that restore smotrim.ru support in youtube-dl?

Thanks 😃

The patch replaces line 155, 'http://player.rutv.ru/iframe/data%s/id/%s' % ('live' if is_live else 'video', video_id), with f'http://player.smotrim.ru/iframe/data{"live" if is_live else "video"}/id/{video_id}/sid/smotrim'. I am wary of submitting this as a PR, because I'm pretty sure it won't work universally.
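
One reason the replacement may not work universally: f-strings need Python 3.6 or later, while the debug logs further down this thread show youtube-dl running on Python 3.4.4. A minimal sketch of the same URL built with %-formatting follows. The helper name build_data_url and its parameters are hypothetical and not part of rutv.py; the default player host is taken from the JSON data URL quoted in the issue description.

```python
# Hypothetical helper (not part of rutv.py): builds the JSON data URL
# described in this thread with %-formatting, which also runs on
# Python versions that predate f-strings.
def build_data_url(video_id, is_live=False,
                   player='https://player.smotrim.ru', sid='smotrim'):
    kind = 'live' if is_live else 'video'
    return '%s/iframe/data%s/id/%s/sid/%s' % (player, kind, video_id, sid)


print(build_data_url('2568723'))
```

Keeping the player host and the sid value as parameters makes it easier to try the other hosts mentioned in this thread without editing the format string.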

@betterthanever2
Author

OP is located inside the Ukraine...
not that big of a deal, but Ukraine goes without an article.

@Vangelis66

Vangelis66 commented Feb 23, 2023

The patch is replacing line 155

Which "line 155" ?

If one patches rutv.py (current state in git master) according to this patch by dirkf, one can't find L155 with cited content 😞 ... There's now L160 that appears to be relevant:

159        json_data = self._download_json(
160            '%s/iframe/data%s/id/%s' % (player, 'live' if is_live else 'video', video_id),
161            video_id, 'Downloading JSON')

so, perhaps, this one needs to be patched?
IOW, what's the file you applied your posted patch on?

not that big of a deal, but Ukraine goes without an article.

Noted and corrected 😉 ; I must have been (subconsciously) carried away by
... inside the UK,
that I often write myself 😄 ...

Regards.

@betterthanever2
Author

betterthanever2 commented Feb 23, 2023

The patch is replacing line 155

so, perhaps, this one needs to be patched? IOW, what's the file you applied your posted patch on?

Yes, the line you reference is the one. The patch by dirkf is the one I applied. I may have done that manually, which may have resulted in a different number of lines in the file. Quite honestly, I don't remember; I just made it work and moved on.

@Vangelis66

Vangelis66 commented Feb 23, 2023

... Right... It was a case of finding a working and whitelisted RU HTTPS proxy 😉 ...
As I posted above, I simply patched the git master edition of rutv.py with the linked patch by dirkf - I did NOT patch anything else in RUTVIE ...

yt-dl --proxy "localhost:8080" -vF "https://smotrim.ru/video/2568723" => 

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ffmpeg-location', '.\\FFmpeg', '--external-downloader-args', '-v 8 -stats', '--proxy', 'localhost:8080', '-vF', 'https://smotrim.ru/video/2568723']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2023.02.23.114514
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg n5.2-dev-2245-N-109649-gab8cde6, ffprobe n5.2-dev-2245-N-109649-gab8cde6, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {'https': 'localhost:8080', 'http': 'localhost:8080'}
[RUTV] 2568723: Downloading JSON
[RUTV] 2568723: Downloading m3u8 information
[info] Available formats for 2568723:
format code  extension  resolution note
hls-400      mp4        unknown     400k
hls-800      mp4        unknown     800k
hls-1200     mp4        unknown    1200k
hls-1800     mp4        unknown    1800k
hls-4050     mp4        unknown    4050k
http-1080    mp4        1920x1080
http-234     mp4        1920x1080
http-360     mp4        1920x1080
http-540     mp4        1920x1080
http-720     mp4        1920x1080  (best)

The extractor has to be amended, so that resolutions are:
a) calculated for the HLS formats (currently unknown)
b) corrected for the HTTP formats (all showing as 1920x1080; hint: the format code gives the height, from which the width follows, e.g. http-360 => 640x360)

Additionally, the (best) label should be assigned to http-1080 rather than to http-720; and, though advertised as available, format http-234 always returns a 404...
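
As a rough illustration of points (b) and the (best) label, the sketch below guesses a resolution from each HTTP format code and ranks the formats by height rather than by name. It assumes a 16:9 aspect ratio, inferred only from the http-360 => 640x360 example above; the function name resolution_from_format_id is hypothetical and this is not the extractor's actual code.

```python
import re


def resolution_from_format_id(format_id):
    """Return (width, height) guessed from a code like 'http-360', else None."""
    m = re.match(r'http-(\d+)$', format_id)
    if not m:
        return None
    height = int(m.group(1))
    width = height * 16 // 9  # assumption: all HTTP variants are 16:9
    return width, height


format_ids = ['http-1080', 'http-234', 'http-360', 'http-540', 'http-720']

# Rank numerically by height (worst first, best last) instead of by string.
ranked = sorted(format_ids, key=lambda f: resolution_from_format_id(f)[1])
for f in ranked:
    print(f, '%dx%d' % resolution_from_format_id(f))
```

The listing above is in the order a plain string sort of the format codes would give, which would explain why http-720 ends up last and gets the (best) label; sorting on the numeric height puts http-1080 last instead.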

The media files AREN'T BLOCKED per se, at least not here...

yt-dl --proxy "localhost:8080" -f http-360 "https://smotrim.ru/video/2568723" -g => 

https://cdn-v.rtr-vesti.ru/_cdn_auth/secure/v/vh/mp4/medium-wide/002/740/306.mp4?auth=mh&vid=2740306

and then one can perform a DIRECT download:

yt-dl "https://cdn-v.rtr-vesti.ru/_cdn_auth/secure/v/vh/mp4/medium-wide/002/740/306.mp4?auth=mh&vid=2740306" -o "Кто против Одно из самых масштабных посланий президента Федеральному Собранию. Эфир от 21.02.2023-2568723.mp4" => 

[generic] 306: Requesting header
[download] Destination: Кто против Одно из самых масштабных посланий президента Федеральному Собранию. Эфир от 21.02.2023-2568723.mp4
[download] 100% of 271.86MiB in 07:39

@dirkf
Contributor

dirkf commented Feb 24, 2023

If the HLS media URLs are like the mp4 ones, a descriptive resolution can be extracted from the URL itself.
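
A minimal sketch of that idea, using the direct mp4 URL quoted earlier in this thread: the path segment after /mp4/ (here medium-wide) looks like a quality label. The helper name quality_label_from_url is hypothetical, and whether the HLS URLs follow the same pattern is not confirmed here.

```python
import re


def quality_label_from_url(url):
    """Return the path segment that follows '/mp4/', or None if absent."""
    m = re.search(r'/mp4/([^/]+)/', url)
    return m.group(1) if m else None


url = ('https://cdn-v.rtr-vesti.ru/_cdn_auth/secure/v/vh/mp4/'
       'medium-wide/002/740/306.mp4?auth=mh&vid=2740306')
print(quality_label_from_url(url))
```

Mapping such labels to pixel dimensions would need the full set of labels the CDN uses, which this thread does not list.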

OT: historically the word "Ukraine", which is etymologically equivalent to Borders or Marches, was used in English for the region of the Russian Tsarist/Soviet empire where the modern independent country Ukraine is situated; that is the significance of "the Ukraine". A place might be inside the RF and not in Ukraine, but still be in "the Ukraine".

@betterthanever2
Author

A place might be inside the RF and not in Ukraine, but still be in "the Ukraine".

Sorry, what?

Anyway, the word "Ukraine" has an etymology, that is true; the same goes for almost any country name. What you mentioned about "borders or marches" is one hypothesis in this case (not an established fact, mind you), but more importantly, contemporary usage of words is very rarely tied to their etymology. The world would be a weird place if it were. So why don't we just agree to call the countries whatever their respective people have implicitly agreed upon, and leave it at that?

This is hardly a place for such discussions.

@Vangelis66

This comment was marked as off-topic.

@dirkf

This comment was marked as off-topic.

@dirkf

This comment was marked as off-topic.

@betterthanever2
Author

betterthanever2 commented Feb 24, 2023

The established usage and distinction in English that I described (more accurately than Wikipedia IMO) isn't a hypothesis.

Well, it is a hypothesis, albeit a commonly accepted one. I just don't think there's such a thing as a 100% fact with these things. It's not a matter of record, after all.

The use of "the Ukraine" is becoming rarer, if only and horrifically because the invasion has led English speakers to focus on the nation rather than the general area.

Quite honestly, I have never heard anybody referring to the 'general area' that way, not in English, not in Russian, and not in Ukrainian, so this "becoming rarer" feels... 🤨 How old are you, exactly? 😄

@dirkf
Contributor

dirkf commented Feb 24, 2023

From the UK, all the variants player.rutv.ru, together with player.smotrim.ru and player.vgtrk.com (same IP), resolve but respond to pings only occasionally, if at all, at least until the forthcoming Untergang.

The patched code can be modified at the 4th line of the _real_extract() method:

        player = 'http://player.smotrim.ru'

Possibly --geo-verification-proxy ... works?

@Vangelis66

Vangelis66 commented Feb 26, 2023

Possibly --geo-verification-proxy ... works?

It doesn't:

yt-dl -vF --geo-verification-proxy "localhost:8080" "https://smotrim.ru/video/2568723" => 

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ffmpeg-location', '.\\FFmpeg', '--external-downloader-args', '-v 8 -stats', '-vF', '--geo-verification-proxy', 'localhost:8080','https://smotrim.ru/video/2568723']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2023.02.25.334
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg n5.2-dev-2245-N-109649-gab8cde6, ffprobe n5.2-dev-2245-N-109649-gab8cde6, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[RUTV] 2568723: Downloading JSON
ERROR: Unable to download JSON metadata: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it> (caused by URLError(ConnectionRefusedError(10061, 'No connection could be made because the target machine actively refused it', None, 10061, None),))
  File "common.py", line 635, in _request_webpage
  File "YoutubeDL.py", line 2300, in urlopen
  File "C:\hostedtoolcache\windows\Python\3.4.4\x86\lib\urllib\request.py", line 464, in open
  File "C:\hostedtoolcache\windows\Python\3.4.4\x86\lib\urllib\request.py", line 482, in _open
  File "C:\hostedtoolcache\windows\Python\3.4.4\x86\lib\urllib\request.py", line 442, in _call_chain
  File "D:\a\youtube-dl\youtube-dl\youtube_dl\utils.py", line 2634, in http_open
  File "C:\hostedtoolcache\windows\Python\3.4.4\x86\lib\urllib\request.py", line 1185, in do_open

... and I have explained in the past why it doesn't work (in the context of a French TV InfoExtractor that I'm too lazy to dig up right now): the InfoExtractor it's used with has to explicitly support that switch...
Supporting it inside the IE means passing, once or several times and in the "right" place(s), an additional request header in the form of

headers=self.geo_verification_headers()

E.g. theplatformIE, a "framework" used in many other (mainly American) IEs, has this part of code:

def _extract_theplatform_smil(self, smil_url, video_id, note='Downloading SMIL data'):
    meta = self._download_xml(
        smil_url, video_id, note=note, query={'format': 'SMIL'},
        headers=self.geo_verification_headers())

--geo-verification-proxy is particular to IEs where:
a) the playlist/manifest generating API is geo-blocked and can't be fooled by an X-Forwarded-For header (which is what the --geo-bypass and --geo-bypass-country switches rely on)
b) the stream CDN(s) do not geo-block.

In this case, the webpage itself is being blocked for me, so I expect the "extra" code has to be applied in the webpage fetch itself, not just for the "manifest" API... 😞
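
To make the mechanism concrete, here is a simplified, self-contained model of how such a per-request proxy switch can work. It is a stand-in under stated assumptions, not youtube-dl's actual classes: the functions geo_verification_headers(params) and proxy_for_request(...) are written for this example, and the header name Ytdl-request-proxy is believed to match the internal marker youtube-dl uses, which should be checked against the source.

```python
# Simplified stand-in, NOT youtube-dl's real code.
# Assumption: the proxy choice travels as a special request header that
# the opener strips off before the request leaves the machine.
PROXY_HEADER = 'Ytdl-request-proxy'


def geo_verification_headers(params):
    """Return the extra header only if the user set a geo-verification proxy."""
    headers = {}
    proxy = params.get('geo_verification_proxy')
    if proxy:
        headers[PROXY_HEADER] = proxy
    return headers


def proxy_for_request(headers, default_proxy=None):
    """Pick the proxy for one request and return the headers to really send."""
    headers = dict(headers)
    proxy = headers.pop(PROXY_HEADER, default_proxy)
    return proxy, headers


params = {'geo_verification_proxy': 'https://localhost:8080'}

# A request made with the extra headers is routed through the proxy...
print(proxy_for_request(geo_verification_headers(params)))
# ...while a request made without them is not, whatever the switch says.
print(proxy_for_request({}))
```

This is why the switch has no effect on an extractor that never passes those headers: only the requests that opt in are routed through the geo-verification proxy, and the media downloads stay direct.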

@Vangelis66

OK, based on my analysis above, and a lot of trial-and-error, I concocted a version of rutv.py that works for me with the --geo-verification-proxy switch and my HTTPS-only Russian proxy:

142    def _real_extract(self, url):
143        mobj = re.match(self._VALID_URL, url)
144        video_id = mobj.group('id')
145        video_path = mobj.group('path')
146-        player = 'http://player.smotrim.ru'
146+        player = 'https://player.smotrim.ru'
147        
148        if video_path.startswith('iframe'):
149            video_type = mobj.group('type')
150            if video_type == 'swf':
151                video_type = 'video'
152        elif video_path.startswith('index/iframe/cast_id'):
153            video_type = 'live'
154        else:
155            video_type = 'video'
156
157        is_live = video_type == 'live'
158
159        json_data = self._download_json(
160            '%s/iframe/data%s/id/%s' % (player, 'live' if is_live else 'video', video_id),
161-            video_id, 'Downloading JSON')
161+            video_id, 'Downloading JSON',
162+            headers=self.geo_verification_headers())

I had to explicitly set the HTTPS version of the player, else my proxy couldn't handle the redirection from HTTP to HTTPS (???):

yt-dl -vF --geo-verification-proxy "https://localhost:8080" "https://smotrim.ru/video/2568723" => 

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--ffmpeg-location', '.\\FFmpeg', '--external-downloader-args', '-v 8 -stats', '-vF', '--geo-verification-proxy', 'https://localhost:8080', 'https://smotrim.ru/video/2568723']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2023.02.25.334
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg n5.2-dev-2245-N-109649-gab8cde6, ffprobe n5.2-dev-2245-N-109649-gab8cde6, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[RUTV] 2568723: Downloading JSON
[RUTV] 2568723: Downloading m3u8 information
[info] Available formats for 2568723:
format code  extension  resolution note
hls-400      mp4        unknown     400k
hls-800      mp4        unknown     800k
hls-1200     mp4        unknown    1200k
hls-1800     mp4        unknown    1800k
hls-4050     mp4        unknown    4050k
http-1080    mp4        1920x1080
http-234     mp4        1920x1080
http-360     mp4        1920x1080
http-540     mp4        1920x1080
http-720     mp4        1920x1080  (best)

and then:

yt-dl --geo-verification-proxy "https://localhost:8080" -f http-540 "https://smotrim.ru/video/2568723" => 

[RUTV] 2568723: Downloading JSON
[RUTV] 2568723: Downloading m3u8 information
[download] Destination: Кто против Одно из самых масштабных посланий президента Федеральному Собранию. Эфир от 21.02.2023-2568723.mp4
[download]   4.7% of 528.15MiB at 693.60KiB/s ETA 12:22

NB that a dl speed of ca. 700KiB/s is due to the DIRECT connection 😜 ...
