Add Page Size Overrides per Channel #702

Open. Wants to merge 1 commit into base: testing

@Boo1098 Boo1098 commented Apr 27, 2024

This is a minimum viable product for per-channel overwrites of the number of videos, shorts, and livestreams to query when refreshing subscriptions. I tested all 3 overrides and they worked. The current method of resetting an override is clunky (setting it to a negative number), and I'm open to ideas on how to implement that better. I've also upended some of the query building in subscriptions and haven't fully tested whether that breaks anything, but I've been running this on my install for the past few days with no ill effects that I've found.
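
As an illustration of the reset behavior described above, here is a minimal sketch of treating a negative input as "remove the overwrite". The `apply_overwrite_input` helper is hypothetical and not taken from the PR; only the key name matches the ones used in this thread.

```python
def apply_overwrite_input(overwrites: dict, key: str, value: int) -> dict:
    """Hypothetical form handler: a negative value resets the overwrite."""
    updated = dict(overwrites)  # copy so the caller's dict is not mutated
    if value < 0:
        # negative input means "unset": drop the key entirely
        updated.pop(key, None)
    else:
        updated[key] = value
    return updated


key = "subscriptions_channel_size"
after_set = apply_overwrite_input({}, key, 5)
after_reset = apply_overwrite_input(after_set, key, -1)
```

Removing the key, rather than storing a sentinel, means a later lookup can fall back to the global setting.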

New channel -> about overwrites page:
[screenshot]

Sorry I can't contribute to the burn down of maintainability items, I'm not too experienced and don't know where to start for any of the remaining open items.

@Boo1098 Boo1098 marked this pull request as ready for review April 27, 2024 23:01
@bbilly1 bbilly1 (Member) commented May 6, 2024

Thanks for taking a stab at that.

Current backlog:

Please be patient.

@bbilly1 bbilly1 (Member) commented May 15, 2024

OK, I took some time to look into this. That is some old code, I don't think I've touched it since the project began. :-)

I think it's time to break out the query building into a separate class there. That is more or less self contained. Then this also prepares us for future expansion if we ever need to add additional types.

This is what I came up with:

class VideoQueryBuilder:
    """Build queries for yt-dlp."""

    def __init__(self, config: dict, channel_overwrites: dict | None = None):
        self.config = config
        self.channel_overwrites = channel_overwrites or {}

    def build_queries(
        self, video_type: VideoTypeEnum | None, limit: bool = True
    ) -> list[tuple[VideoTypeEnum, int | None]]:
        """Build queries for all or specific video type."""
        query_methods = {
            VideoTypeEnum.VIDEOS: self.videos_query,
            VideoTypeEnum.STREAMS: self.streams_query,
            VideoTypeEnum.SHORTS: self.shorts_query,
        }

        if video_type:
            # build query for specific type
            query_method = query_methods.get(video_type)
            if query_method:
                query = query_method(limit)
                if query[1] != 0:
                    return [query]
                return []

        # Build and return queries for all video types
        queries = []
        for build_query in query_methods.values():
            query = build_query(limit)
            if query[1] != 0:
                queries.append(query)

        return queries

    def videos_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
        """Build query for videos."""
        return self._build_generic_query(
            video_type=VideoTypeEnum.VIDEOS,
            overwrite_key="subscriptions_channel_size",
            config_key="channel_size",
            limit=limit,
        )

    def streams_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
        """Build query for streams."""
        return self._build_generic_query(
            video_type=VideoTypeEnum.STREAMS,
            overwrite_key="subscriptions_live_channel_size",
            config_key="live_channel_size",
            limit=limit,
        )

    def shorts_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
        """Build query for shorts."""
        return self._build_generic_query(
            video_type=VideoTypeEnum.SHORTS,
            overwrite_key="subscriptions_shorts_channel_size",
            config_key="shorts_channel_size",
            limit=limit,
        )

    def _build_generic_query(
        self,
        video_type: VideoTypeEnum,
        overwrite_key: str,
        config_key: str,
        limit: bool,
    ) -> tuple[VideoTypeEnum, int | None]:
        """Generic query for video page scraping."""
        if not limit:
            return (video_type, None)

        if overwrite_key in self.channel_overwrites:
            overwrite = self.channel_overwrites[overwrite_key]
            return (video_type, overwrite)

        if overwrite := self.config["subscriptions"].get(config_key):
            return (video_type, overwrite)

        return (video_type, None)
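
To make the resolution order in `_build_generic_query` concrete, here is a minimal standalone sketch. The `resolve_limit` helper and the sample values are hypothetical and not part of the project; the sketch only mirrors the precedence shown above: no limit requested, then the channel overwrite, then the global config.

```python
def resolve_limit(config, channel_overwrites, overwrite_key, config_key, limit=True):
    """Hypothetical helper mirroring the precedence in _build_generic_query."""
    if not limit:
        return None  # caller asked for everything
    if overwrite_key in channel_overwrites:
        return channel_overwrites[overwrite_key]  # per-channel value wins
    if value := config["subscriptions"].get(config_key):
        return value  # fall back to the global setting
    return None  # nothing configured (or falsy): no limit


# sample values, assumed for illustration only
config = {"subscriptions": {"channel_size": 50, "shorts_channel_size": 10}}
overwrites = {"subscriptions_shorts_channel_size": 3}

videos_limit = resolve_limit(
    config, overwrites, "subscriptions_channel_size", "channel_size"
)
shorts_limit = resolve_limit(
    config, overwrites, "subscriptions_shorts_channel_size", "shorts_channel_size"
)
streams_limit = resolve_limit(
    config, overwrites, "subscriptions_live_channel_size", "live_channel_size"
)
full_scan = resolve_limit(
    config, overwrites, "subscriptions_channel_size", "channel_size", limit=False
)
```

With these sample values, videos use the global size, shorts use the channel overwrite, and streams have no limit because nothing is configured for them.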

That then should simplify get_last_youtube_videos to something like this:

def get_last_youtube_videos(
    self,
    channel_id,
    limit=True,
    query_filter=None,
    channel_overwrites=None,
):
    """get a list of last videos from channel"""
    query_handler = VideoQueryBuilder(self.config, channel_overwrites)
    queries = query_handler.build_queries(query_filter)
    last_videos = []

    for vid_type_enum, limit_amount in queries:
        obs = {
            "skip_download": True,
            "extract_flat": True,
        }
        vid_type = vid_type_enum.value

        if limit:
            obs["playlistend"] = limit_amount

        url = f"https://www.youtube.com/channel/{channel_id}/{vid_type}"
        channel_query = YtWrap(obs, self.config).extract(url)
        if not channel_query:
            continue

        last_videos.extend(
            [
                (i["id"], i["title"], vid_type)
                for i in channel_query["entries"]
            ]
        )

    return last_videos

This is minimally tested, and I'm not sure it covers all cases. But it is much more explicit and less confusing than what I had there before and what you had to work with. :-)
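
One case that may be worth checking in the function above: when `limit` is true but a query carries `None` as its amount, `playlistend` is set to `None`. A minimal sketch of building the options so the key is only present when there is an actual number follows. The `build_obs` helper is hypothetical; the option names are the ones already used in the snippet above.

```python
def build_obs(limit_amount=None):
    """Hypothetical helper: yt-dlp options for a flat channel page listing."""
    obs = {
        "skip_download": True,
        "extract_flat": True,
    }
    if limit_amount is not None:
        # only cap the listing when a concrete page size is known
        obs["playlistend"] = limit_amount
    return obs


limited = build_obs(5)
unlimited = build_obs()
```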

I'm also thinking that we might want to have a more sophisticated download form at some point, e.g. when you add a channel to the form, to download x amount of videos, or something like that...

Does that make sense?

Also you might want to rebase on testing branch, I pushed quite a few things since you branched here.

@Boo1098 Boo1098 (Author) commented May 18, 2024

Yeah, this looks much simpler. I'll put it together into a commit.

> I'm also thinking that we might want to have a more sophisticated download form at some point, e.g. when you add a channel to the form, to download x amount of videos, or something like that...
>
> Does that make sense?

This sounds great. It would fix the problem I had when setting up and importing all my YouTube subscriptions: I really only wanted new videos, so I had to manually ignore all the added videos. I think that's outside the scope of this PR, but I'll keep it in mind.

Boo1098 added a commit to Boo1098/tubearchivist that referenced this pull request May 18, 2024
Based on bbilly1's code from their comment in tubearchivist#702

@Boo1098 Boo1098 (Author) commented May 18, 2024

Added your code, works great. I only had to make one change here: when the main config entry is False, it means no videos of that type should be queried, rather than an unlimited number.
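
The actual change lives in the commit and is not reproduced in this thread. As a rough sketch of the idea only, a `False` config entry could be mapped to `0` so that a `!= 0` filter like the one in `build_queries` drops that type. The `resolve_limit_strict` helper and the sample config are hypothetical.

```python
def resolve_limit_strict(config, channel_overwrites, overwrite_key, config_key, limit=True):
    """Hypothetical variant: a False config entry disables the video type."""
    if not limit:
        return None
    if overwrite_key in channel_overwrites:
        return channel_overwrites[overwrite_key]
    value = config["subscriptions"].get(config_key)
    if value is False:
        return 0  # disabled globally: signal "query nothing"
    return value  # an int page size, or None when the key is missing


# sample config, assumed for illustration only
config = {"subscriptions": {"channel_size": 50, "shorts_channel_size": False}}
sizes = {
    "videos": resolve_limit_strict(
        config, {}, "subscriptions_channel_size", "channel_size"
    ),
    "shorts": resolve_limit_strict(
        config, {}, "subscriptions_shorts_channel_size", "shorts_channel_size"
    ),
    "streams": resolve_limit_strict(
        config, {}, "subscriptions_live_channel_size", "live_channel_size"
    ),
}
# same filter idea as build_queries: skip types whose size is 0
active = {name: size for name, size in sizes.items() if size != 0}
```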

I also added a check that the channel override is not None as that is what it was set to when it was removed.

This is a minimum viable product. Tested all 3 overrides and they
worked. The current method of resetting the override is clunky (setting
to negative number). I've also upended some of the build query in
subscriptions and haven't fully tested if that messes with things.

Moved query building into its own class

Based on bbilly1's code from their comment in tubearchivist#702

@Boo1098 Boo1098 (Author) commented May 18, 2024

> I also added a check that the channel override is not None as that is what it was set to when it was removed.

Looking into this more, I think this came from me not properly deleting entries from the channel overrides config when I was testing earlier. A fresh install does not need this check, as the keys are actually deleted when the override is unset.
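
A small sketch of why a leftover `None` entry matters with a membership check like `overwrite_key in self.channel_overwrites`: the key is still present, so `None` is returned as the overwrite instead of falling back to the global value. The `pick_limit` helper and the numbers are hypothetical.

```python
def pick_limit(global_size, channel_overwrites, key):
    """Hypothetical lookup using the same membership check as the builder."""
    if key in channel_overwrites:
        return channel_overwrites[key]
    return global_size


key = "subscriptions_channel_size"
stale = pick_limit(50, {key: None}, key)  # key left behind with a None value
clean = pick_limit(50, {}, key)  # key deleted when the override was unset
```

Here `stale` ends up as `None` while `clean` falls back to the global size, which matches the observation that the extra check is only needed when old entries were not deleted.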
