[Bug]: "Add to download queue" defaults to adding an entire channel if a short video URL is added. Downloading an entire channel does not download shorts. #385

Closed
2 tasks done
maltbeverage opened this issue Dec 10, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@maltbeverage

maltbeverage commented Dec 10, 2022

I've read the documentation

Operating System

Fedora 36, Docker, Latest image tag.

Your Bug Report

Describe the bug

The "Add to download queue" function in the Downloads page performs a few checks to determine if the URL provided is a video, playlist, or channel. If the URL does not match any of these checks, the URL is defaulted to a channel. As a result, if a shorts URL is used (example https://www.youtube.com/shorts/6QImkSXqwao), the detect_from_url function in helper.py will not correctly set the URL as a video, because there is no if statement to handle a URL matching "/shorts/".

Since there is no check for shorts in the URL, the default option on line 195 returns the download type as a channel. When the entire channel is processed, all regular videos are added to the queue, but the shorts are not, since there is no processing for shorts in get_last_youtube_videos.

Currently the only way to download a short is to convert the URL to a standard video URL by replacing the /shorts/ part of the URL with /watch?v=. Edit: shorts can also be downloaded by video ID alone, per the wiki.
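
The URL rewrite described above can be sketched in a few lines of Python. This is only an illustration of the manual workaround, not code from the project; the function name shorts_to_watch_url is hypothetical.

```python
from urllib.parse import urlparse


def shorts_to_watch_url(url: str) -> str:
    """Rewrite a /shorts/<id> URL as /watch?v=<id>; return other URLs unchanged.

    Hypothetical helper for illustration, not part of the project.
    """
    parsed = urlparse(url)
    parts = parsed.path.split("/")
    # A shorts path looks like "/shorts/<id>", so parts is ["", "shorts", "<id>"].
    if len(parts) >= 3 and parts[1] == "shorts" and parts[2]:
        return f"{parsed.scheme}://{parsed.netloc}/watch?v={parts[2]}"
    return url


print(shorts_to_watch_url("https://www.youtube.com/shorts/6QImkSXqwao"))
# https://www.youtube.com/watch?v=6QImkSXqwao
```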

When adding an entire channel to the download queue, short videos are not included because the get_last_youtube_videos function in subscriptions.py only checks for videos in https://www.youtube.com/channel/{channel_id}/videos.

Adding a channel's shorts page, for example https://www.youtube.com/@gensho_yasuda/shorts, only loads normal videos.

As a result, it is not possible to download a short using the standard shorts URL, add an entire channel to download normal videos and short videos together, or subscribe to a channel to download new shorts. This may apply to streams as well though I did not test this.

Steps To Reproduce

  1. Click on "Add to download queue"
  2. Add the following short video path: https://www.youtube.com/shorts/kHQCJDo_RzI
  3. Click the "Add to download queue" button
  4. The short will fail to be added to the queue and all videos in the /videos/ section of the channel will be added to the queue.

Expected behavior

Two parts to this.

  1. When adding a short URL, that specific video should be downloaded. Adding the following if statement at line 194 of helper.py resolves this particular issue and treats URLs with /shorts/ as videos. I don't really know any Python, so this may be a subpar fix.
        if parsed.path.startswith("/shorts/"):
            youtube_id = parsed.path.split("/")[2]
            _ = self.find_valid_id(youtube_id)
            return youtube_id, "video"

I believe that it might be better to not default to channel and instead throw an error if the URL does not match channel, video, playlist, shorts, streams, etc.
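
To make the suggestion concrete, here is a minimal standalone sketch of a URL classifier that raises an error for unrecognised URLs instead of falling back to a channel. It does not mirror the real detect_from_url in helper.py; the function names and the set of URL shapes handled (watch, shorts, playlist, /channel/) are assumptions chosen for illustration, and handle-style URLs such as /@name are deliberately left out.

```python
from urllib.parse import urlparse, parse_qs


def _second_segment(path: str) -> str:
    """Return the path segment after the first one, or '' if it is missing."""
    parts = path.split("/")
    return parts[2] if len(parts) >= 3 else ""


def classify_youtube_url(url: str) -> tuple[str, str]:
    """Return (item_id, item_type) or raise ValueError for unknown URLs.

    Hypothetical sketch, not the project's implementation.
    """
    parsed = urlparse(url)
    path = parsed.path
    query = parse_qs(parsed.query)

    if path == "/watch" and "v" in query:
        return query["v"][0], "video"
    if path.startswith("/shorts/") and _second_segment(path):
        # Shorts are ordinary videos behind a different path.
        return _second_segment(path), "video"
    if path == "/playlist" and "list" in query:
        return query["list"][0], "playlist"
    if path.startswith("/channel/") and _second_segment(path):
        return _second_segment(path), "channel"

    # No silent fallback: the caller decides how to report the problem.
    raise ValueError(f"unrecognised YouTube URL: {url}")


print(classify_youtube_url("https://www.youtube.com/shorts/kHQCJDo_RzI"))
# ('kHQCJDo_RzI', 'video')
```

With this shape, an unsupported URL surfaces as an explicit error rather than queueing a whole channel by accident.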

  2. There does not appear to be any provision to automatically download shorts and streams when subscribing to a channel or adding a channel to the download queue. This would be a nice feature to have, though I suspect it might not be a good default and should probably be configurable. I think something similar to this request is outlined in [Feature Request]: Optionally downloading #Shorts and livestreams, and giving them their own page #368.

I think the only place that would need to be updated to add short and stream support at a really simple level is the get_last_youtube_videos function in subscriptions.py. It looks like there is only a check for the URL ending in /videos when looking up a channel ID. I think this would need to also perform a check to see if any shorts or streams exist and if so combine those together to add to the download queue.

I made an attempt that kind of worked, though I don't know python. I'm sure this is shitty code and has some unexpected problems 😭.

    def get_last_youtube_videos(self, channel_id, limit=True):
        """get a list of last videos, shorts and streams from channel"""
        obs = {
            "skip_download": True,
            "extract_flat": True,
        }
        if limit:
            obs["playlistend"] = self.config["subscriptions"]["channel_size"]

        # query each channel tab separately and collect the entries of all tabs
        entries = []
        for tab in ("videos", "shorts", "streams"):
            url = f"https://www.youtube.com/channel/{channel_id}/{tab}"
            channel_tab = YtWrap(obs, self.config).extract(url)
            if channel_tab:
                entries.extend(channel_tab.get("entries", []))

        if not entries:
            return False

        last_videos = [(i["id"], i["title"]) for i in entries]
        return last_videos
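
A note on combining per-tab results: Python's dict | operator keeps only the right-most value for a duplicate key, so merging result dicts that each carry an "entries" list retains the entries of just one tab. The standalone illustration below uses made-up dicts in place of real extractor output; the helper name combine_entries and the sample IDs are hypothetical.

```python
# Made-up stand-ins for the per-tab results; real extractor output has more keys.
videos = {"id": "UC_example", "entries": [{"id": "vid1", "title": "A video"}]}
shorts = {"id": "UC_example", "entries": [{"id": "short1", "title": "A short"}]}
streams = {}  # a tab that returned nothing

# Dict union replaces the whole "entries" value, it does not concatenate lists.
merged = videos | shorts | streams
print([e["id"] for e in merged["entries"]])  # ['short1']


def combine_entries(*results):
    """Concatenate the "entries" lists of several result dicts (hypothetical helper)."""
    entries = []
    for result in results:
        # Treat None or an empty dict as a tab with no entries.
        entries.extend((result or {}).get("entries", []))
    return entries


combined = combine_entries(videos, shorts, streams)
print([(e["id"], e["title"]) for e in combined])
# [('vid1', 'A video'), ('short1', 'A short')]
```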

Relevant log output

Log after following steps to reproduce.

processing: https://www.youtube.com/shorts/DFd9_sDBMv0
ParseResult(scheme='https', netloc='www.youtube.com', path='/shorts/DFd9_sDBMv0', params='', query='', fragment='')
[{'url': 'UCpd0n9H-HdK4GplD5RGvIDg', 'type': 'channel'}]
[2022-12-10 15:04:29,492: INFO/MainProcess] Task home.tasks.extrac_dl[b3ef705b-2c4e-411e-9715-5175adb9f4cd] received
[2022-12-10 15:04:29,870: WARNING/ForkPoolWorker-16] xCXlzhS5Sz8: add to download queue
[2022-12-10 15:04:30,925: WARNING/ForkPoolWorker-16] wxpVWxkSS8g: add to download queue
[2022-12-10 15:04:31,853: WARNING/ForkPoolWorker-16] lhYCpd1k3P8: add to download queue
[2022-12-10 15:04:32,865: WARNING/ForkPoolWorker-16] jw5NWWyiW70: add to download queue

Anything else?

Apologies if this is a known issue. I tried to search for anything about shorts and not much came up. Thanks for all your work on this project. It really is useful and much appreciated.

@PhuriousGeorge
Contributor

Just wanted to add: hell of a report, thanks for the detail! This is a known shortcoming caused by recent changes in YouTube/yt-dlp, and while not nearly as specific, the request you referenced, #368, is very closely related (minus the defaulting to /video).

@bbilly1
Member

bbilly1 commented Dec 11, 2022

Thanks for collecting this here. This will require some significant changes to accommodate the recent changes in the YouTube interface. But it's also a good opportunity, as until now we didn't have a good way to automatically filter out shorts from the download queue.

In any case, thanks for all the details, that is going to help a lot when working on that.

bbilly1 added the bug label, and added and then removed the duplicate label, Dec 11, 2022
@maltbeverage
Author

Thanks for the update. Happy to help if you need a hand bug checking anything. I'm an awful programmer but I excel in finding new and exciting ways to break software.

@bbilly1
Member

bbilly1 commented Dec 11, 2022

Currently the only way to download a short is to convert the URL to a standard video by replacing the /shorts/ part of the URL with /watch?v=

BTW, you can still just pass the video ID (see the wiki); yt-dlp handles that correctly, and TA will add it as a regular video.

@maltbeverage
Author

Currently the only way to download a short is to convert the URL to a standard video by replacing the /shorts/ part of the URL with /watch?v=

BTW, you can still just pass the video ID (see the wiki); yt-dlp handles that correctly, and TA will add it as a regular video.

That's a much better workaround. I clearly didn't RTFM. Thanks for that.

@bbilly1
Member

bbilly1 commented Jan 14, 2023

Sorry, forgot to tag the commits here, but that is now handled properly in v0.3.1. Please update.

Thanks for reporting!

bbilly1 closed this as completed Jan 14, 2023