Improve documentation and usage of release_year #8524

Merged
merged 4 commits into from Nov 26, 2023

Conversation

seproDev
Collaborator

@seproDev seproDev commented Nov 5, 2023

IMPORTANT: PRs without the template will be CLOSED

Description of your pull request and other information

  • Update docs to better reflect actual usage of release_year
  • Auto generate release year if not present
  • Fix HarpodeonIE, which reported an incorrect release date when only the year was known

Closes #7263

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Copilot Summary

馃 Generated by Copilot at cfea45e

Summary

📅🎞️🛠️

This pull request adds a new release_year field to the common metadata fields for extracted videos, and updates the Harpodeon extractor to use it. It also adds logic to infer the release_year from the release_date when possible.

release_year field
added to InfoExtractor
autumn of metadata

Walkthrough

  • Add a new common field release_year to the InfoExtractor class and document its meaning and usage (link)
  • Remove an outdated documentation line that referred to release_year as an album-specific field (link)
  • Update the Harpodeon extractor to use the release_year field instead of the release_date field and convert the year to an integer or None (link, link)
  • Update the test cases for the Harpodeon extractor to reflect the change from release_date to release_year and use integers instead of strings (link, link, link)
  • Add logic to the YoutubeDL class to calculate the release_year field from the release_date field if not explicitly set by the extractor (link)
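
The inference step in the last bullet can be sketched as below. This is a minimal standalone illustration, not the code from the PR: `fill_release_year` is a hypothetical helper name, and the sketch assumes release_date is a YYYYMMDD string.

```python
def fill_release_year(info_dict):
    """Hypothetical helper: derive release_year from release_date when missing."""
    # Only fill the field when the extractor did not set it explicitly.
    if info_dict.get('release_year') is None and info_dict.get('release_date') is not None:
        try:
            # Assumption: release_date is a YYYYMMDD string, so the year is the first 4 chars.
            info_dict['release_year'] = int(info_dict['release_date'][:4])
        except (TypeError, ValueError):
            # Malformed date: leave the year unknown rather than raising.
            info_dict['release_year'] = None
    return info_dict


info = fill_release_year({'id': 'example', 'release_date': '20231105'})
print(info['release_year'])
```

An explicitly set release_year is left untouched, so an extractor that knows only the year can report it without inventing a full date.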

@seproDev seproDev added the docs/meta/cleanup related to docs, code cleanup, templates, devscripts etc label Nov 5, 2023
@bashonly bashonly self-requested a review November 7, 2023 13:28
@seproDev seproDev added the pending-review PR needs a review label Nov 21, 2023
Comment on lines 2589 to 2590
    if info_dict.get('release_year') is None and info_dict.get('release_date') is not None:
        info_dict['release_year'] = int_or_none(info_dict['release_date'][:4])
Member

The thing that gives me pause about this is that it will cause many otherwise-passing download tests to fail

Also could probably just use int() here, but either way is fine
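
The practical difference between the two options is how a malformed release_date is handled. A minimal sketch, assuming a simplified stand-in (`int_or_none_sketch`, a hypothetical name) rather than yt-dlp's real int_or_none, which accepts more parameters:

```python
def int_or_none_sketch(value):
    """Simplified stand-in: return None instead of raising on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def year_strict(release_date):
    # Plain int(): raises ValueError if the first four characters are not digits.
    return int(release_date[:4])


def year_tolerant(release_date):
    # Tolerant conversion: a malformed date yields None instead of an exception.
    return int_or_none_sketch(release_date[:4])


print(year_strict('20231126'), year_tolerant('20231126'), year_tolerant('unknown'))
```

For well-formed YYYYMMDD strings both variants agree, which is consistent with the remark that either way is fine.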

Member

We can make an exception in the tester similar to

yt-dlp/test/helper.py

Lines 221 to 223 in 21dc069

    # display_id may be generated from id
    if test_info_dict.get('display_id') == test_info_dict.get('id'):
        test_info_dict.pop('display_id')

Member

Suggested change
    - if info_dict.get('release_year') is None and info_dict.get('release_date') is not None:
    -     info_dict['release_year'] = int_or_none(info_dict['release_date'][:4])
    + if not info_dict.get('release_year'):
    +     info_dict['release_year'] = traverse_obj(info_dict, ('release_date', lambda x: int(x[:4])))

Collaborator Author

@seproDev seproDev Nov 23, 2023

Okay, so should this be used as an exception in sanitize_got_info_dict?

    # release_year may be generated from release_date
    if test_info_dict.get('release_year') and test_info_dict.get('release_date'):
        test_info_dict.pop('release_year')

There shouldn't be that many IEs that return both a release_date and a release_year.
I could find: ArchiveOrgIE, MonstercatIE, TrueIDIE (inconsistent), and YoutubeIE

Member

@pukkandan pukkandan Nov 23, 2023

There can be situations in theory where the year doesn't match the date. I think this'd be better:

    if try_call(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])):
        test_info_dict.pop('release_year')

That way only the auto-generated value will be excluded
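
The behaviour described above can be illustrated with a standalone sketch. `try_call_sketch` is a simplified, hypothetical stand-in for yt-dlp's try_call (it takes a single callable and returns None on common lookup and conversion errors), and `drop_autogenerated_release_year` is a hypothetical wrapper name used only for this example:

```python
def try_call_sketch(func):
    """Simplified stand-in: call func, returning None if a common error occurs."""
    try:
        return func()
    except (AttributeError, KeyError, TypeError, IndexError, ValueError):
        return None


def drop_autogenerated_release_year(test_info_dict):
    """Remove release_year only when it equals the year prefix of release_date."""
    if try_call_sketch(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])):
        test_info_dict.pop('release_year')
    return test_info_dict


# Year matches the date prefix, so it is treated as auto-generated and dropped.
print(drop_autogenerated_release_year({'release_date': '20231123', 'release_year': 2023}))
# Year differs from the date, so it is kept and still checked by the test.
print(drop_autogenerated_release_year({'release_date': '20231123', 'release_year': 1999}))
```

A missing release_date raises KeyError inside the lambda, which the stand-in turns into None, so a release_year reported on its own is also kept.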

Member

> and YoutubeIE

release_year is currently broken for Youtube if I remember right

seproDev and others added 2 commits November 23, 2023 20:26
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
@bashonly bashonly removed the pending-review PR needs a review label Nov 23, 2023
@bashonly bashonly merged commit 1732ecc into yt-dlp:master Nov 26, 2023
15 checks passed
@seproDev seproDev deleted the release_year branch November 26, 2023 04:19
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this pull request Apr 21, 2024
Development

Successfully merging this pull request may close these issues.

[bandcamp.com] add release_year metadata field
3 participants