[Extractors] Add maariv.co.il extractor #8331

Merged
merged 20 commits into from Dec 18, 2023

@amir16yp (Contributor) commented Oct 12, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

Adds an extractor for https://maariv.co.il, an Israeli news website.
The generic extractor does not extract videos from articles, so I reverse engineered the site and wrote a dedicated extractor. It is by no means perfect, but it is good enough to be used now. I'd appreciate any feedback on it before merging.

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Copilot Summary

🤖 Generated by Copilot at 10ea327

Summary

📥📄🎥

Add support for extracting videos from maariv.co.il articles. Create a new file maariv.py that defines the MaarivIE extractor and import it in _extractors.py.

MaarivIE class
extracts videos from news
autumn of scraping

Walkthrough

  • Add a new extractor for maariv.co.il
    • Import the MaarivIE class from maariv.py in _extractors.py
    • Define the MaarivIE class in maariv.py
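
As a rough illustration of the structure described in the walkthrough: yt-dlp extractors conventionally declare a class-level `_VALID_URL` regex with a named `id` group, and the extractor class is then imported in _extractors.py. The sketch below mimics that convention using only the standard library. The class name `MaarivSketchIE`, the helper names `suitable` and `match_id` as written here, and the URL pattern itself are assumptions made for illustration; none of them is taken from the merged maariv.py.

```python
import re


class MaarivSketchIE:
    """Illustrative stand-in only; a real extractor would subclass
    yt-dlp's InfoExtractor and implement the actual extraction."""

    # Hypothetical pattern: the real URL shape handled by the PR is not
    # shown in this thread, so this regex is an assumption.
    _VALID_URL = r'https?://(?:www\.)?maariv\.co\.il/(?:[^/?#]+/)+Article-(?P<id>\d+)'

    @classmethod
    def suitable(cls, url):
        # True when the URL matches the pattern from its start.
        return re.match(cls._VALID_URL, url) is not None

    @classmethod
    def match_id(cls, url):
        # Return the named "id" group, or fail loudly for unknown URLs.
        mobj = re.match(cls._VALID_URL, url)
        if mobj is None:
            raise ValueError(f'URL not handled by this sketch: {url}')
        return mobj.group('id')


if __name__ == '__main__':
    # Placeholder URL, not a real article.
    print(MaarivSketchIE.match_id('https://www.maariv.co.il/news/law/Article-123456'))
```

Keeping the ID inside a named group means the same pattern can both decide whether a URL belongs to the extractor and supply the video ID.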

@bashonly added the site-request (Request to support a new website) label Oct 13, 2023
@bashonly self-requested a review October 13, 2023 03:03
amir16yp and others added 5 commits October 19, 2023 15:41
Co-authored-by: garret <garret1317@yandex.com>
Co-authored-by: garret <garret1317@yandex.com>
Co-authored-by: garret <garret1317@yandex.com>
Co-authored-by: garret <garret1317@yandex.com>
Co-authored-by: garret <garret1317@yandex.com>
@seproDev added the pending-fixes (PR has had changes requested) label Nov 16, 2023
amir16yp and others added 4 commits December 12, 2023 16:37
match embed regex instead of article url

Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
@amir16yp
Contributor Author

@seproDev I merged your suggestions. I ran a test download: it downloads and lists formats just fine, but it fails the test.
Here's the log of the command:

sneed@pc-~/Desktop/yt-dlp-> python test/test_download.py TestDownload.test_Maariv_all
E
======================================================================
ERROR: test_Maariv_all (__main__.TestDownload):
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/sneed/Desktop/yt-dlp/test/test_download.py", line 298, in test_template
    getattr(self, test_name)()
  File "/home/sneed/Desktop/yt-dlp/test/test_download.py", line 113, in test_template
    raise Exception(f'Test {tname} definition incorrect - "ext" key must be present to define the output file')
Exception: Test test_Maariv definition incorrect - "ext" key must be present to define the output file

----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (errors=1)

I'm not sure what this means. What do I need to change?
There are also some merge conflicts with _extractors.py, since it has been updated in the meantime.
@bashonly

@bashonly
Member

bashonly commented Dec 12, 2023

@amir16yp I fixed some things. Pull and try again:
python test/test_download.py TestDownload.test_Maariv
python test/test_download.py TestDownload.test_Maariv_webpage

The tests will fail at first, but they should output the missing info_dict fields, which you can copy/paste.
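
For illustration, the error quoted earlier in the thread suggests that each test definition needs an info_dict containing an "ext" key. The sketch below shows what such an entry might look like, together with a simplified paraphrase of that check. It is not the test harness's actual code: the name `EXAMPLE_TESTS`, the function `check_test_definition`, and every URL, ID, and field value are placeholders rather than values from the merged extractor.

```python
# Hypothetical test entry; all values are placeholders, not taken from the PR.
EXAMPLE_TESTS = [{
    'url': 'https://www.maariv.co.il/news/law/Article-123456',
    'info_dict': {
        'id': '123456',
        'ext': 'mp4',  # the key the error message complains about when absent
        'title': 'placeholder title',
    },
}]


def check_test_definition(test_case, test_name):
    """Simplified paraphrase of the sanity check implied by the error message.

    Raises ValueError when the test's info_dict lacks an "ext" value.
    """
    info_dict = test_case.get('info_dict', {})
    if not info_dict.get('ext'):
        raise ValueError(
            f'Test {test_name} definition incorrect - '
            '"ext" key must be present to define the output file')
    return True


if __name__ == '__main__':
    # Passes: the placeholder entry defines "ext".
    check_test_definition(EXAMPLE_TESTS[0], 'test_Example')
    # An entry without "ext" triggers the same style of message as in the log.
    try:
        check_test_definition({'url': 'x', 'info_dict': {'id': '1'}}, 'test_Bad')
    except ValueError as err:
        print(err)
```

The remaining info_dict fields would then come from the output of the failing test run, as described above.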

@seproDev added the pending-review (PR needs a review) label and removed the pending-fixes (PR has had changes requested) label Dec 17, 2023
@bashonly removed the pending-review (PR needs a review) label Dec 17, 2023
@seproDev merged commit c5f01bf into yt-dlp:master Dec 18, 2023
14 of 15 checks passed
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Labels
site-request Request to support a new website
5 participants