[ie/EuroParlWebstream] Support new URL format #9647

Merged: 8 commits, May 11, 2024

@voidful (Contributor) commented Apr 8, 2024

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

Update the _VALID_URL regex so that it also matches the latest link format:
latest format: https://multimedia.europarl.europa.eu/en/webstreaming/20240320-1345-SPECIAL-PRESSER
original format: https://multimedia.europarl.europa.eu/en/webstreaming/committee-on-culture-and-education_20230301-1130-COMMITTEE-CULT
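
To make the difference between the two formats concrete, here is a minimal sketch of a pattern that accepts both. It is a hypothetical illustration, not the regex that was merged: `VALID_URL` and `extract_id` are names invented for this example, and the only assumption is that the descriptive slug ending in `_` is optional.

```python
import re

# Hypothetical pattern for illustration; not the regex merged in this PR.
# The "<slug>_" prefix is optional, so both URL shapes produce an id.
VALID_URL = re.compile(
    r'https?://multimedia\.europarl\.europa\.eu/'
    r'[^/#?]+/'             # language segment, e.g. "en"
    r'(?!video)[^/#?]+/'    # section segment that does not start with "video"
    r'(?:[\w-]+_)?'         # optional descriptive slug ending in "_"
    r'(?P<id>[\w-]+)'       # webstream id
)


def extract_id(url):
    """Return the webstream id, or None if the URL is not recognized."""
    match = VALID_URL.match(url)
    return match.group('id') if match else None


NEW = 'https://multimedia.europarl.europa.eu/en/webstreaming/20240320-1345-SPECIAL-PRESSER'
OLD = ('https://multimedia.europarl.europa.eu/en/webstreaming/'
       'committee-on-culture-and-education_20230301-1130-COMMITTEE-CULT')

print(extract_id(NEW))
print(extract_id(OLD))
```

Because `\w` also matches `_`, the optional slug group is greedy up to the last underscore in the final path segment. That is fine for the two URLs above, but an id that itself contains an underscore would be split at that point.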

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?


update regex to support latest link
@bashonly added the site-bug label (Issue with a specific website) Apr 8, 2024
https?://multimedia\.europarl\.europa\.eu/[^/#?]+/
(?:(?!video)[^/#?]+/[\w-]+_)(?P<id>[\w-]+)
'''
_VALID_URL = r'''(?x)https?://multimedia\.europarl\.europa\.eu/(?:(?:[^/#?]+/)*[\w-]+/)?(?:(?!video)[^/#?]+/)?(?:[\w-]+_)?(?P<id>[\w-]+)'''
@bashonly (Member) commented Apr 8, 2024

changing the formatting makes it very difficult to see what the diff is, and now the line is too long and (?x) is unnecessary. please revert

also, please add a test for the new url format
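
As a rough illustration of the `(?x)` point: verbose mode only tells Python's `re` module to ignore whitespace and comments inside the pattern, so it matters for a multi-line layout and has no effect on a single line without whitespace. The sketch below uses the original pattern lines quoted in the diff. The `(?x)` prefix and the triple-quoted layout of `MULTI_LINE` are assumed from context, and `match_id` is a helper name invented for this example.

```python
import re

# Original pattern lines as quoted in the diff, laid out in verbose mode.
# The "(?x)" prefix and the triple-quoted layout are assumed from context.
MULTI_LINE = r'''(?x)
    https?://multimedia\.europarl\.europa\.eu/[^/#?]+/
    (?:(?!video)[^/#?]+/[\w-]+_)(?P<id>[\w-]+)
'''

# The same pattern on one line: there is no whitespace to ignore,
# so the "(?x)" flag is not needed.
ONE_LINE = (r'https?://multimedia\.europarl\.europa\.eu/[^/#?]+/'
            r'(?:(?!video)[^/#?]+/[\w-]+_)(?P<id>[\w-]+)')

OLD = ('https://multimedia.europarl.europa.eu/en/webstreaming/'
       'committee-on-culture-and-education_20230301-1130-COMMITTEE-CULT')
NEW = 'https://multimedia.europarl.europa.eu/en/webstreaming/20240320-1345-SPECIAL-PRESSER'


def match_id(pattern, url):
    """Return the captured id, or None when the pattern does not match."""
    match = re.match(pattern, url)
    return match.group('id') if match else None


for url in (OLD, NEW):
    # Both layouts should agree for every URL.
    print(match_id(MULTI_LINE, url), match_id(ONE_LINE, url))
```

Both layouts behave identically here. Neither accepts the new URL format, because the original pattern requires the `<slug>_` prefix, which is the gap this pull request addresses.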

Contributor Author

Hello @bashonly,

Thank you for your feedback. I understand the concern: reformatting the regular expression makes the actual change hard to see in the diff.

I'll revert the formatting change so the diff is easy to compare. I will also add a test case that covers the new URL format.

Best regards.

@bashonly added the pending-fixes label (PR has had changes requested) Apr 8, 2024
@seproDev changed the title from "Update europa.py" to "[ie/EuroParlWebstream] Support new URL format" Apr 20, 2024
@seproDev (Member) commented May 9, 2024

I cleaned this up and rewrote the regex. Please check that all URL formats that should be matched are being matched. If there are any I missed, a new test will need to be added.
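
One way to run such a check locally is a small table of URLs that should and should not match. The sketch below is an assumption-laden example, not part of the PR. `CANDIDATE` is the first-revision pattern quoted in the review diff earlier in this thread, used only as sample input because the rewritten regex is not shown here. The `/video/` URL is hypothetical, and `check` is a helper invented for this example.

```python
import re

# First-revision pattern as quoted in the review diff; sample input only.
CANDIDATE = r'''(?x)https?://multimedia\.europarl\.europa\.eu/(?:(?:[^/#?]+/)*[\w-]+/)?(?:(?!video)[^/#?]+/)?(?:[\w-]+_)?(?P<id>[\w-]+)'''

# URL -> expected id, taken from the pull request description.
SHOULD_MATCH = {
    'https://multimedia.europarl.europa.eu/en/webstreaming/20240320-1345-SPECIAL-PRESSER':
        '20240320-1345-SPECIAL-PRESSER',
    'https://multimedia.europarl.europa.eu/en/webstreaming/'
    'committee-on-culture-and-education_20230301-1130-COMMITTEE-CULT':
        '20230301-1130-COMMITTEE-CULT',
}

# Hypothetical URL under /video/, which the original pattern excluded.
SHOULD_NOT_MATCH = [
    'https://multimedia.europarl.europa.eu/en/video/example_I000000',
]


def check(pattern):
    """Return a list of problems; an empty list means every case passed."""
    problems = []
    for url, expected in SHOULD_MATCH.items():
        match = re.match(pattern, url)
        got = match.group('id') if match else None
        if got != expected:
            problems.append(f'expected id {expected!r}, got {got!r}: {url}')
    for url in SHOULD_NOT_MATCH:
        if re.match(pattern, url):
            problems.append(f'unexpected match: {url}')
    return problems


for problem in check(CANDIDATE):
    print(problem)
```

With the pattern as quoted, this sketch accepts both webstreaming URLs but also reports the hypothetical `/video/` URL as an unexpected match: the optional leading group can consume the `video/` segment before the `(?!video)` lookahead is ever applied. A negative case in the table is what surfaces this kind of over-matching.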

@seproDev added the pending-review label (PR needs a review) and removed the pending-fixes label (PR has had changes requested) May 9, 2024
@bashonly removed the pending-review label (PR needs a review) May 11, 2024
@seproDev merged commit 800a439 into yt-dlp:master May 11, 2024
6 checks passed
Labels
site-bug Issue with a specific website
3 participants