7Plus (Australia) extractor doesn't provide series name #15862

Closed
section83 opened this issue Mar 14, 2018 · 6 comments

@section83 section83 commented Mar 14, 2018

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.03.10. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.03.10

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Site support request (request for adding support for a new site) <= change to existing extractor.

Description of your issue, suggested solution and other information

The 7Plus.com.au site has a new design and no longer provides series names in the usual way. Instead, the series name is stored in the web page title, i.e. between the <title> and </title> tags, with the 7Plus name appended. For example, the series name for all episodes of the Mighty Planes TV show is stored as: <title>Mighty Planes | 7Plus</title>. This is the only location where I could easily find the series name. I have tried other means, including options, output template parameters and parsing the URL, without success. The series name is not in the JSON data.

An example web page is: https://7plus.com.au/MTYS?episode-id=MTYS7-003

Can the 7Plus extractor be updated so that youtube-dl outputs the series name taken from the web page title?
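
The request above can be illustrated with a minimal sketch. It assumes only what the description states: the page title has the form "<series name> | 7Plus". The function name series_from_webpage and the constant SITE_SUFFIX_RE are hypothetical and are not part of youtube-dl.

```python
import re

# Hypothetical helper, not youtube-dl code. It assumes the page title looks
# like "<series name> | 7Plus" (the brand's capitalisation may vary).
SITE_SUFFIX_RE = re.compile(r'\s*\|\s*7plus\s*$', re.IGNORECASE)


def series_from_webpage(webpage):
    """Return the series name found in the page <title>, or None."""
    match = re.search(r'<title>([^<]+)</title>', webpage)
    if match is None:
        return None
    page_title = match.group(1).strip()
    # Drop the trailing " | 7Plus" site name, if present.
    series = SITE_SUFFIX_RE.sub('', page_title).strip()
    return series or None


if __name__ == '__main__':
    html = '<html><head><title>Mighty Planes | 7Plus</title></head></html>'
    print(series_from_webpage(html))
```

Matching the suffix with a regular expression rather than cutting a fixed number of characters keeps the sketch working if the spacing or capitalisation of the brand name changes.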

Thanks.

Contributor

@kayb94 kayb94 commented Mar 15, 2018

7plus seems to be geo-restricted. I tried both Tor and some proxies I found, but was not able to get access; I always get "HTTP Error 403".
Anyway, since you described very clearly what is necessary, I made a commit on my own repo. Please try it out, since I can't test it at all. Do so with the following commands:

git clone https://github.com/kayb94/youtube-dl -b 7plus15862 --depth 1
cd youtube-dl
python -m youtube_dl "https://7plus.com.au/MTYS?episode-id=MTYS7-003" -v

And then tell me whether it worked or not. :)

Author

@section83 section83 commented Mar 15, 2018

Many thanks for this. The resulting download is named S7 E3 - Wind Surf-MTYS7-003.mp4, which doesn't include the series name. Here's the verbose output:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'https://7plus.com.au/MTYS?episode-id=MTYS7-003', u'-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.03.14
[debug] Git HEAD: 4b56911
[debug] Python version 2.7.10 (CPython) - Darwin-16.7.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.4.2-tessus, ffprobe 3.4.2-tessus, rtmpdump 2.4
[debug] Proxy map: {}
[7plus] MTYS7-003: Downloading JSON metadata
[7plus] MTYS7-003: Downloading m3u8 information
[7plus] MTYS7-003: Downloading m3u8 information
[7plus] MTYS7-003: Downloading MPD manifest
[7plus] MTYS7-003: Downloading MPD manifest
[7plus] MTYS7-003: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on u'https://manifest.prod.boltdns.net/manifest/v1/hls/v5/clear/5303576322001/51e4e57d-dbe9-4d83-8b34-40a3e4be2dda/ba050c29-69f9-42fb-bed3-b57718388a88/10s/rendition.m3u8?fastly_token=NWFhYTAzYzlfM2E2ODRjM2Q3NTMyOTYxMWIyZWUzNTc0NDM5ZjUzZWI1MmM2NDIzMTAyNmIxZjBmOWJjOTg3NzlmNjM4MjliNw%3D%3D'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 277
[download] Destination: S7 E3 - Wind Surf-MTYS7-003.fhls-6789-1.mp4
[download] 100% of 1.97GiB in 07:31
[debug] Invoking downloader on u'https://manifest.prod.boltdns.net/manifest/v1/dash/live-hbbtv15/clear/5303576322001/51e4e57d-dbe9-4d83-8b34-40a3e4be2dda/2s/'
[dashsegments] Total fragments: 1393
[download] Destination: S7 E3 - Wind Surf-MTYS7-003.fdash-4077f4e1-ffb9-47d2-9f4c-3d8a531d40fe-1.m4a
[download] 100% of 64.19MiB in 02:14
[ffmpeg] Merging formats into "S7 E3 - Wind Surf-MTYS7-003.mp4"
[debug] ffmpeg command line: ffmpeg -y -i 'file:S7 E3 - Wind Surf-MTYS7-003.fhls-6789-1.mp4' -i 'file:S7 E3 - Wind Surf-MTYS7-003.fdash-4077f4e1-ffb9-47d2-9f4c-3d8a531d40fe-1.m4a' -c copy -map '0:v:0' -map '1:a:0' 'file:S7 E3 - Wind Surf-MTYS7-003.temp.mp4'
Deleting original file S7 E3 - Wind Surf-MTYS7-003.fhls-6789-1.mp4 (pass -k to keep)
Deleting original file S7 E3 - Wind Surf-MTYS7-003.fdash-4077f4e1-ffb9-47d2-9f4c-3d8a531d40fe-1.m4a (pass -k to keep)

In the extractor, there is this:

        if info['title'] is None:
            webpage = self._download_webpage(url, episode_id)
            info['title'] = self._search_regex(r'<title>([^>]+)</title>', webpage, 'title')

In that code, is the "title" being sought the same as the video title used in the youtube-dl output template? 7Plus always has a video title, but they don't provide the series title. Thus:

youtube-dl --get-filename -o '%(series)s-%(title)s.%(ext)s' "https://7plus.com.au/MIPL?episode-id=MIPL03-001"

results in: NA-S3 E1 - CP140 Aurora.mp4

But they do put the series title in the web page title together with their own brand (i.e. <title>Mighty Planes | 7Plus</title>). The series title is that text with the last 8 characters (" | 7Plus") trimmed off.

They don't have season and episode numbers in the metadata either but, at least, those are in the video title and so can be retrieved.
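
As a rough sketch of how those numbers could be pulled out of the video title, assuming every title follows the "S<season> E<episode> - <name>" pattern seen in this thread (e.g. "S7 E3 - Wind Surf"). The function split_episode_title is hypothetical and is not part of the extractor; the keys it returns merely mirror youtube-dl's season_number, episode_number and episode output template fields.

```python
import re

# Assumed title layout: "S<season> E<episode> - <episode name>".
TITLE_RE = re.compile(
    r'^S(?P<season>\d+)\s+E(?P<episode>\d+)\s*-\s*(?P<name>.+)$')


def split_episode_title(title):
    """Split a title such as 'S5 E1 - Invisible Killer' into its parts.

    Titles that do not match the assumed layout are returned unchanged
    under 'episode', with both numbers set to None.
    """
    match = TITLE_RE.match(title.strip())
    if match is None:
        return {'season_number': None, 'episode_number': None,
                'episode': title}
    return {
        'season_number': int(match.group('season')),
        'episode_number': int(match.group('episode')),
        'episode': match.group('name').strip(),
    }


if __name__ == '__main__':
    print(split_episode_title('S5 E1 - Invisible Killer'))
```

Falling back to None for titles that do not match means a one-off special without a season prefix would not break the parse.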

Cheers.

Contributor

@kayb94 kayb94 commented Mar 15, 2018

I'm not sure I understand you correctly. If the text between <title> and </title> includes "| 7Plus", shouldn't the result of
youtube-dl --get-filename -o '%(series)s-%(title)s.%(ext)s' "https://7plus.com.au/MIPL?episode-id=MIPL03-001"
be NA-S3 E1 - CP140 Aurora | 7Plus.mp4 or similar?
Did my code change the result? If so, there was no title at all before (if info['title'] is None:).
It's a shame I can't access the website.

Author

@section83 section83 commented Mar 15, 2018

If the text between <title> and </title> includes "| 7Plus", shouldn't the result of
youtube-dl --get-filename -o '%(series)s-%(title)s.%(ext)s' "https://7plus.com.au/MIPL?episode-id=MIPL03-001"
be NA-S3 E1 - CP140 Aurora | 7Plus.mp4 or similar?

The text " | 7Plus" is part of the web page title, but not part of the series name or the file title. We don't need it in the file name; it's the name of the web site and so, ideally, would be trimmed off.

In your example, it would help if the result was: Mighty Planes-S3 E1 - CP140 Aurora.

I did more tests today with a different show. Here is what happened:

python -m youtube_dl -v --get-filename -o '%(season)s-%(season_number)s-%(title)s.%(ext)s' "https://7plus.com.au/MDAY?episode-id=MDAY5-001"

resulted in: NA-NA-S5 E1 - Invisible Killer.mp4

python -m youtube_dl -v --get-title -o '%(season)s-%(season_number)s-%(title)s.%(ext)s' "https://7plus.com.au/MDAY?episode-id=MDAY5-001"

resulted in: S5 E1 - Invisible Killer

I have been hoping that the extractor can provide the "series" parameter so that the result of the first test would be Air Crash Investigations-NA-S5 E1 - Invisible Killer.mp4. [I would drop the season number].
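
The "NA" in these results is what the output template shows for any field the extractor does not supply. The sketch below is a simplified illustration of that behaviour, not youtube-dl's actual code: it assumes missing fields are filled with the string 'NA' before %-style formatting is applied, and render_filename is a hypothetical name.

```python
from collections import defaultdict


def render_filename(template, info):
    """Fill a %-style output template, using 'NA' for missing fields."""
    # Any key absent from `info` is looked up through the default factory.
    fields = defaultdict(lambda: 'NA', info)
    return template % fields


if __name__ == '__main__':
    template = '%(series)s-%(season_number)s-%(title)s.%(ext)s'
    info = {'title': 'S5 E1 - Invisible Killer', 'ext': 'mp4'}
    # Without a series field the first placeholder falls back to 'NA'.
    print(render_filename(template, info))
    # Once a series value is present, it appears in the file name.
    info['series'] = 'Air Crash Investigations'
    print(render_filename(template, info))
```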

I don't have experience in overcoming geo-blocks, but I believe a VPN that terminates within Australia would work. I did some testing recently on downloading from ITV and found that youtube-dl does something with proxies which enabled me to finish the testing. Will that work on 7Plus? In case it's useful, I've attached a copy of an example web page in a couple of formats.
Air Crash Investigations | 7plus.Firefox-webpagecomplete.zip
Air Crash Investigations | 7plus.Safari-webarchive.zip

I should disclose that I'm developing a GUI youtube-dl front-end for macOS. It's just a retirement project in AppleScript. All my code is freely available with the applet.

Cheers.

Contributor

@kayb94 kayb94 commented Mar 17, 2018

Hi,

I think I achieved what you were looking for; see the new commit. Honestly, at first I simply misunderstood you. ^^ Please try it out. I used the following command line; as far as I can see, no other relevant information is extracted separately.
python -m youtube_dl -v -o '%(series)s - %(title)s.%(ext)s' "https://7plus.com.au/MDAY?episode-id=MDAY5-001"
Trying to extract the season and episode number wouldn't be feasible, I think, because both are only present in the normal title (altogether, it's "S5 E1 - Invisible Killer").

I also added a test. Did you know about the already ongoing youtube-dl GUI project? I never tried it and basically only know that it exists, but maybe it's helpful for your work!

Ref: kayb94/youtube-dl@4db0556

Regards

Author

@section83 section83 commented Mar 18, 2018

Yes, that's done it. Many thanks. The resulting file is called:

Air Crash Investigations - S5 E1 - Invisible Killer.mp4

Yes, season and episode numbers are often in the title and not as meaningful as the series name.

I did do some research on GUIs for macOS. There are other GUIs written in AppleScript or Automator, but they usually require manual updates to youtube-dl and ffmpeg. One always installed Python even though Python is standard in macOS. Some had faults that got in the way, while others lacked functions I wanted. There are some good cross-platform efforts. One called youtube-dl-gui requires wxPython. It looks well designed, but I wanted something more native to macOS.

Cheers.

@dstftw dstftw closed this in d9e2240 Mar 19, 2018