MongoDB Writing Many Subscriptions at Once – Performance Issue #534

Open
l-ucky opened this issue Mar 28, 2022 · 18 comments
Labels
is: bug (Something isn't working) · meta: good first issue (Good for newcomers) · meta: help wanted (Help Request for our Codebase/Docs/...)

Comments

@l-ucky

l-ucky commented Mar 28, 2022

Describe the bug
When switching from the standard YoutubeDL-Material database to the recommended Mongo database I am having issues keeping an uninterrupted database connection for an unknown reason.

To Reproduce
Steps to reproduce the behavior:

  1. Set up ~30 new subscriptions with the -w argument to not overwrite files (purpose: to make YTDLM catalogue subscriptions in MongoDB with no redownloading content)
  2. Go to Settings -> Database in the YTDLM web GUI and change the MongoDB connection string in the textbox from the default value to one with my server's IP address, port, and the trailing modifier after the IP/port.
  3. Test connection to DB, it works no problem
  4. Observe the MongoDB Compass GUI program that is supposed to reflect changes of the DB on my server
  5. No changes show up in MongoDB Compass; I am stuck at 6.2k "Documents" catalogued.
  6. Also, YTDLM is using 100% of its available resources. I limited the container to 10 GB of RAM and 2 cores.
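
For readers following step 2, here is a minimal sketch of how a connection string of the form "IP address, port, trailing options" can be assembled. The helper name, the address, and the option shown are placeholders for illustration and are not taken from this report; 27017 is MongoDB's default port.

```javascript
// Hypothetical helper; host, port, and option values below are placeholders.
function buildMongoUri({ host, port = 27017, options = {} }) {
  // URLSearchParams URL-encodes the trailing options (the part after "?").
  const query = new URLSearchParams(options).toString();
  const base = `mongodb://${host}:${port}/`;
  return query ? `${base}?${query}` : base;
}

console.log(buildMongoUri({ host: '192.168.1.50', options: { compressors: 'zlib' } }));
// mongodb://192.168.1.50:27017/?compressors=zlib
```

Building the string from parts makes it harder to drop the slash before the "?" or to mistype the port when editing the textbox by hand.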

Expected behavior
I expect to see maximum CPU and RAM output with no DB logging of "redownloaded with -w" files. I'll mention again that the -w argument makes it so no file is overwritten, so in theory it has been simply only writing database changes into MongoDB. This spontaneously happened, and I didn't adjust any settings from the point where the DB was writing normally and where it will not write anything now.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment
Installed version: v4.2 - done You are up to date.

Installation type: docker
Docker tag: nightly
Commit hash: 88cc8d0
Build date: 2021-10-01

Ideally you'd copy the info as presented on the "About" dialogue
in YoutubeDL-Material.
(for that, click on the three dots on the top right and then
check "installation details". On later versions of YoutubeDL-
Material you will find pretty much all the crucial information
here that we need in most cases!)

Additional context
imgur link of YTDLM DB settings and MongoDB Compass https://i.imgur.com/WVKvfxa.png
Error Log:
Disregard the last four lines with MongoDB errors; I was experimenting with different links to see if I could fix it, but it doesn't work.

2022-03-28T16:02:28.218Z INFO: Config items set using ENV variables.
2022-03-28T16:02:28.678Z INFO: YoutubeDL-Material v4.2 started on PORT 17442
2022-03-28T16:46:04.849Z ERROR: ERROR: [youtube] 1WPuKIByCSY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] QLYtAXqyKzs: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] NMAl5hjDYIU: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] w7NsVm_Y9yA: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] j-Wk5fIdOrc: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] LSkJ80OLemY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] A8ISC1Zx01Y: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] mkZK5grN01s: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] y75zv7uxXQ0: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] nXyQWbE2RfI: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 7nJTt5tdKYM: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 49trqlg6bMw: Sign in to confirm your age. This video may be inappropriate for some users.
2022-03-28T16:50:57.372Z ERROR: ERROR: [youtube] K5ECJwsgcTg: Sign in to confirm your age. This video may be inappropriate for some users.
2022-03-28T16:55:31.314Z INFO: Config items set using ENV variables.
2022-03-28T16:55:32.560Z INFO: YoutubeDL-Material v4.2 started on PORT 17442
2022-03-28T17:38:21.368Z ERROR: ERROR: [youtube] 1WPuKIByCSY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] QLYtAXqyKzs: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] NMAl5hjDYIU: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] w7NsVm_Y9yA: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] j-Wk5fIdOrc: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] LSkJ80OLemY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] A8ISC1Zx01Y: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] mkZK5grN01s: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] y75zv7uxXQ0: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] nXyQWbE2RfI: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 7nJTt5tdKYM: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 49trqlg6bMw: Sign in to confirm your age. This video may be inappropriate for some users.
2022-03-28T17:43:07.565Z ERROR: ERROR: [youtube] K5ECJwsgcTg: Sign in to confirm your age. This video may be inappropriate for some users.
2022-03-28T17:46:02.720Z INFO: Config items set using ENV variables.
2022-03-28T17:46:03.127Z INFO: YoutubeDL-Material v4.2 started on PORT 17442
2022-03-28T18:29:52.252Z ERROR: ERROR: [youtube] 1WPuKIByCSY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] QLYtAXqyKzs: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] NMAl5hjDYIU: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] w7NsVm_Y9yA: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] j-Wk5fIdOrc: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] LSkJ80OLemY: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] A8ISC1Zx01Y: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] mkZK5grN01s: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] y75zv7uxXQ0: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] nXyQWbE2RfI: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 7nJTt5tdKYM: Sign in to confirm your age. This video may be inappropriate for some users.
ERROR: [youtube] 49trqlg6bMw: Sign in to confirm your age. This video may be inappropriate for some users.
2022-03-28T18:35:27.384Z INFO: Config items set using ENV variables.
2022-03-28T18:35:27.770Z INFO: YoutubeDL-Material v4.2 started on PORT 17442
2022-03-28T18:41:00.558Z ERROR: connection timed out
2022-03-28T18:41:00.656Z ERROR: Failed to connect to MongoDB. Verify your connection string is valid.
2022-03-28T18:41:36.498Z ERROR: connection timed out
2022-03-28T18:41:36.499Z ERROR: Failed to connect to MongoDB. Verify your connection string is valid.
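
A "connection timed out" error like the ones above can come from the network rather than from the connection string itself. The following is a rough sketch, using only Node's built-in `net` module, that checks whether a TCP connection to a host and port can be opened at all. The function name and timeout are assumptions for illustration; it does not speak the MongoDB protocol and is not part of YoutubeDL-Material.

```javascript
const net = require('node:net');

// Hypothetical helper: resolves true if a TCP connection to host:port opens
// within timeoutMs, and false on timeout or error.
function canReach(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const done = (ok) => {
      socket.destroy(); // always release the socket before reporting
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => done(true));
    socket.once('timeout', () => done(false));
    socket.on('error', () => done(false));
  });
}

// Example usage (placeholder address):
// canReach('192.168.1.50', 27017).then((ok) => console.log('reachable:', ok));
```

If this returns false from inside the YTDLM container, the problem is likely routing, firewalling, or Docker networking rather than the database settings.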
@l-ucky l-ucky added the is: bug Something isn't working label Mar 28, 2022
@l-ucky l-ucky changed the title [BUG] MongoDB Writing Many Subscripts at Once – Performance Issue Mar 28, 2022
@l-ucky l-ucky changed the title MongoDB Writing Many Subscripts at Once – Performance Issue MongoDB Writing Many Subscriptions at Once – Performance Issue Mar 28, 2022
@GlassedSilver
Collaborator

I think I noticed something similar with heavily peaking CPU and memory usage, but I was sadly not able to track down the concrete problem. Help with this from someone who is actually a programmer would be appreciated, since the creator is on hiatus, and I'm mostly dealing with support, testing and merging Dependabot and user PRs whilst crossing my fingers.

That being said, you're not on the latest nightly, so maybe try updating and see if that helps.

(The integrated version checker does only check against release-channel releases, not nightlies, hence every 4.2-based nightly will report as up to date as of now)

@GlassedSilver GlassedSilver added meta: help wanted Help Request for our Codebase/Docs/... meta: good first issue Good for newcomers labels Mar 29, 2022
@l-ucky
Author

l-ucky commented Mar 29, 2022

I am using the Nightly build: https://i.imgur.com/tS02dew.png. I also just did a server restart, and I'm keeping the programs on standby until I can get the fix. Thanks for the comment.

@GlassedSilver
Collaborator

Maybe try deleting the containers' images and recreating, because it seems that your Portainer instance used a locally cached version to deploy the stack.

The latest nightly is a few days old and whilst it has some performance issues it also fixed a few things. That being said I thank you for your continued interest in the project. Given that we have a rather sizable list of users I'm positive we'll see some more active development. :)

@GlassedSilver GlassedSilver pinned this issue Mar 29, 2022
@GlassedSilver
Collaborator

GlassedSilver commented Mar 30, 2022

I suspect that the concurrent download limiter max_concurrent_downloads is not doing what it should be doing in downloader.js. (lines 156-173 as of the current commit.)

Probably something with the steps, because even though my instance is set to 5 it is at least pulling info for more than 5 items at once which always fires off a yt-dlp instance (or whatever your selected ytdl-fork binary may be).

This is probably a bit of an issue since spawning up to 100-200 instances at once (or more) - even if they just check for info - can probably get us into rate limiting area. Probably related to #523.

This is just my estimation. If I were any good at coding I'd quickly work on this, but I'll try my very best to bug-hunt this so that any programmer willing to help needs as little additional effort as possible. :/
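
To illustrate what a concurrency limit is expected to guarantee, here is a minimal sketch of a limiter that never has more than `limit` tasks in flight. This is not the code from downloader.js; the function name and task shape are assumptions, and it stands in for any step (including info lookups) that would spawn a yt-dlp process.

```javascript
// Hypothetical sketch, NOT the project's downloader.js logic.
// Runs async tasks with at most `limit` of them in flight at any time.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task nobody has claimed yet

  // Each worker claims one task at a time until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

The key point is that every expensive step has to go through the same limiter. If info retrieval bypasses it, hundreds of processes can still be spawned even with the limit set to 5, which matches the behaviour described above.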

@l-ucky
Author

l-ucky commented Mar 30, 2022

what kind of school do you need to understand that stuff?

@GlassedSilver
Collaborator

Pardon me?

@l-ucky
Author

l-ucky commented Mar 30, 2022

I'm wondering what type of education you need to look at the code you cited and understand it to identify an issue. Also, I repulled the nightly build for ytdlm and started mongoDB again.
https://i.imgur.com/Dtlwf52.png (good connection)
however I'm unable to transfer the local files ytdlm organized into mongoDB
https://i.imgur.com/KukaN1e.png

@GlassedSilver
Collaborator

I completely derped, the thing I talked about was about subscriptions causing too much load, you're talking about transitioning to the MongoDB. Maybe I'm too tired and me hopping between multiple issues in this state got me confused.

Either way the code is in JavaScript, so someone well versed in it should be able to help.

That being said the creator of the application and I had a chat this evening, and he might check on some of the most urgent issues soon-ish, so that's some good news. :)

The transfer failing definitely looks odd to me, so we'll see what can be done about it. :)

Thank you for reporting the issue and hang in there a little bit.

@l-ucky
Author

l-ucky commented Mar 30, 2022

subscriptions causing too much load

Yes, I think this may be still true, there was a lot of resource usage with ~30 subscriptions rewriting data (not video).

you're talking about transitioning to the MongoDB

Yes, this problem happened right now, as part of the debugging process for pulling the latest nightly build. I also have 6.2k existing entries in MongoDB already.

Glad you're in contact with Tzahi. Remember to sleep, it's good for you. I'll be on standby!

@GlassedSilver
Collaborator

subscriptions causing too much load

Yes, I think this may be still true, there was a lot of resource usage with ~30 subscriptions rewriting data (not video).

you're talking about transitioning to the MongoDB

Yes, this problem happened right now, as part of the debugging process for pulling the latest nightly build.

Glad you're in contact with Tzahi. Remember to sleep, it's good for you. I'll be on standby!

Haha, thanks for the words of worry, but I'm fine in general, just need to push sleep time today so I can reschedule my sleep to a later time in the day (I only sleep in the morning or the evening, I'm a night owl and I also work at night) so I'll be free for some errands later this week.

And the rewriting of subscriptions wasn't the particular issue, I think; rather, I take it the problem is that they are queued inefficiently to begin with.

A long queue with concurrent download limits set should never cause memory usage to go this overboard. Overwriting or not. :)

@l-ucky
Author

l-ucky commented Apr 9, 2022

Hello, any updates on the issue?

@GlassedSilver
Collaborator

Hello, any updates on the issue?

Maybe, maybe not.

I did figure out that CPU pinning helps the Docker container behave a lot better. Also, if you pull the latest nightly, or at least update the yt-dlp binary YTDL-M uses (either by restarting the container or by issuing the update command to yt-dlp within the container's command prompt), you should see much better downloading performance, i.e. WAY fewer failed downloads, which ultimately helps avoid an ever-growing job queue.

I also chose to pause some of my subscriptions that are a bit too busy and enable them one by one, so you don't have all those downloads queuing at once (which can become a bit of a vicious circle).

@l-ucky
Author

l-ucky commented Apr 10, 2022

I tried pausing my subscriptions and letting one download at a time. However, it didn't work and YTDLM didn't accept my request when editing the subscription profiles to 'pause'. I then tried to 'Kill All' from the settings menu and every single (30) subscription started up again while only briefly stopping, maybe 1-3 seconds it stopped while watching htop. There is a lot of writing, but MongoDB isn't picking it up at all.

I also cloned this project into my ~/etc/. Then I went into the /examples/ in this repo and edited the docker-compose file with my personal information. It mapped well, but I'm still having the same issue of YTDLM not sending the DB information it's compiling to MongoDB.

@l-ucky
Author

l-ucky commented Apr 16, 2022

Hello, I'm having some unique problems with MongoDB and YTDLM. I set up a stack with never-used file paths, and began with a single video test to see if the DB and YTDLM would work well. I downloaded one 1-minute-long video without any problems. Next I tried a subscription, and YTDLM queued ~250 videos; then my CPU hit 100% usage and I could hear the fans spinning hard. Now I have my one sub paused, and I went into the download section to stop every single download manually, because killing them did nothing.

https://i.imgur.com/hobDJ00.png

  • The ~68% usage reflects 100% of the resources allotted to YTDLM, I allowed 2.5 cores and unrestricted RAM.

@GlassedSilver
Collaborator

I heavily suggest limiting RAM as well, that's what I do. I do pin 4 cores (logical) but the RAM limit is what helped me get a smoother experience.

Please also make sure to set your concurrent download limit rather low, at the very least when you start new subs or add lots at once or have to pick up again after some considerable downtime (and hence a certain backlog of videos to load).

250 videos queued at once certainly will cause some pain until download queuing is refined. The download manager component is fairly "new" as in it got introduced and then shortly after the work on it has paused, that's why it's still in a bit of a beta stage, but it's miles better than what we had before and lets us fine-tune downloads at least to some workable degree. :)

Setting it up correctly can be finicky, but setting limits helps greatly.

Pausing subs, pausing downloads, setting the limit on concurrent downloads and then unpausing the downloads is key.

Killing downloads is not meant to do what you hoped to get from it as you found out. (at least I share your experience)

Hope that these steps will yield you better results. There will still be some other issues that need to be tackled next, but I'm doing support, bug hunting and concepts, if I could code myself I'd be right on it. :D

I had a chat with Tzahi recently and he's up to looking at some pain points again, but spare time is at a premium for him at the moment, so whilst I'm very open for PRs that's all I can say for now. :)
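
The pause, limit, then unpause sequence recommended above can be pictured with a small state model. This is a hypothetical sketch: the class and method names are invented for illustration, it only tracks which items would be active, and it does not reflect how YoutubeDL-Material's download manager is actually implemented.

```javascript
// Hypothetical model of a download queue; it tracks state only and starts
// no real downloads. None of these names come from YoutubeDL-Material.
class DownloadQueue {
  constructor(limit) {
    this.limit = limit;
    this.paused = false;
    this.pending = [];
    this.active = new Set();
  }

  // Promote queued items to "active" while unpaused and under the limit.
  pump() {
    while (!this.paused && this.active.size < this.limit && this.pending.length > 0) {
      this.active.add(this.pending.shift());
    }
  }

  enqueue(id) { this.pending.push(id); this.pump(); }
  pause() { this.paused = true; }
  resume() { this.paused = false; this.pump(); }
  setLimit(limit) { this.limit = limit; this.pump(); }
  finish(id) { this.active.delete(id); this.pump(); }
}

const queue = new DownloadQueue(5);
queue.pause();                       // pause downloads first
for (let i = 0; i < 250; i++) queue.enqueue(`video-${i}`);
queue.setLimit(2);                   // lower the concurrent limit while paused
queue.resume();                      // then unpause
console.log(queue.active.size, queue.pending.length); // 2 248
```

In this model, lowering the limit while paused means the backlog of 250 items is only ever worked through two at a time once the queue resumes.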

@l-ucky
Author

l-ucky commented Apr 16, 2022

Hey thanks for your feedback. I tried your steps, and I'm having no progress. I am getting write errors like this:

2022-04-16T20:12:12.314Z ERROR: Error while retrieving info on video with URL https://www.youtube.com/watch?v=GktsBwCdQJU with the following message: Error: Command failed with exit code 1: /app/node_modules/youtube-dl/bin/youtube-dl -o subscriptions/channels/Adoration of The Cross/%(title)s.%(ext)s --write-info-json --print-json -f '(mp4)[height=720' --write-thumbnail -r 200k --no-clean-infojson --dump-json http://www.youtube.com/watch?v=GktsBwCdQJU
ERROR: [youtube] GktsBwCdQJU: Requested format is not available. Use --list-formats for a list of available formats

I can do single downloads, but no playlist or channel subscriptions. Would you recommend that I do a fresh download again?

@GlassedSilver
Collaborator

Hey thanks for your feedback. I tried your steps, and I'm having no progress. I am getting write errors like this:

2022-04-16T20:12:12.314Z ERROR: Error while retrieving info on video with URL https://www.youtube.com/watch?v=GktsBwCdQJU with the following message: Error: Command failed with exit code 1: /app/node_modules/youtube-dl/bin/youtube-dl -o subscriptions/channels/Adoration of The Cross/%(title)s.%(ext)s --write-info-json --print-json -f '(mp4)[height=720' --write-thumbnail -r 200k --no-clean-infojson --dump-json http://www.youtube.com/watch?v=GktsBwCdQJU
ERROR: [youtube] GktsBwCdQJU: Requested format is not available. Use --list-formats for a list of available formats

I can do single downloads, but no playlist or channel subscriptions. Would you recommend that I do a fresh download again?

That never hurts I think, but I think you might be limited by the format limit of 720p and our pre-defined forced container of mp4, which is something we need to tackle fairly soon. #544 tracks this
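
One general way to avoid "Requested format is not available" when a video lacks the exact container and height requested is to give youtube-dl / yt-dlp a selector with fallbacks, separated by `/`. The sketch below builds such a string. The helper name is hypothetical, and this is not necessarily how the project addresses the problem in #544.

```javascript
// Hypothetical helper: builds a youtube-dl / yt-dlp format selector that
// prefers the given container and height cap, then relaxes step by step.
function buildFormatSelector(ext, maxHeight) {
  return [
    `best[ext=${ext}][height<=${maxHeight}]`, // preferred container, capped height
    `best[height<=${maxHeight}]`,             // any container, capped height
    'best',                                   // last resort: whatever is available
  ].join('/');
}

console.log(buildFormatSelector('mp4', 720));
// best[ext=mp4][height<=720]/best[height<=720]/best
```

The resulting string would be passed as the value of `-f`. The downloader tries each alternative from left to right and uses the first one the video actually offers.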

@Tzahi12345
Owner

Hi @l-ucky, apologies for the delay! Looks like you already did some rate limiting but can you also set your max concurrent downloads to 1 or 2? I did just test this and it looks like it's working as expected.

image

The other error you're seeing of:

ERROR: [youtube] GktsBwCdQJU: Requested format is not available. Use --list-formats for a list of available formats

Has been fixed and will be merged in #657 very shortly!


3 participants