Possible parsing issue #117

Closed
Code-Slave opened this issue May 5, 2021 · 26 comments

@Code-Slave

This channel is causing constant failures to index:

https://www.youtube.com/channel/UCuoMasRkMhlj1VNVAOJdw5w

Index media from source "Channel58", attempted 6 times
Error: "(1366, "Incorrect string value: '\xEF\xBC\xAC\xEF\xBC\xAF...' for column tubesync.background_task.verbose_name at row 1")"

@meeb
Owner

meeb commented May 5, 2021

Thanks, I'll look into it. I don't think that's an issue with the channel. That error is MySQL complaining that the verbose name being set is invalid, which it isn't. Alas, the schema and encoding for that column are set in an upstream library (Django Background Tasks), so it might need some creative hackery to fix.
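As a quick check of the claim that the value itself is fine, the bytes quoted in the 1366 error can be decoded directly. This is only an illustrative sketch: the byte string is copied from the error above (which truncates the rest of the value with "..."), and the variable names are made up for the example.

```python
import unicodedata

# Bytes quoted in the MySQL 1366 error; the error truncates the rest of the value.
raw = b"\xEF\xBC\xAC\xEF\xBC\xAF"

# Decoding succeeds, so the bytes are well-formed UTF-8.
text = raw.decode("utf-8")

for ch in text:
    # Report the code point, its Unicode name and its encoded length in bytes.
    encoded_length = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {encoded_length} bytes")
```

The two characters are fullwidth Latin letters that take three bytes each in UTF-8, which suggests the column's character set, rather than the data, is what MySQL is rejecting.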

@Code-Slave
Author

Semi-related: there is definitely something making tasks hang. It says they are running, but nothing happens for 4 hours. If I delete that row in background_tasks, everything starts up again. It would be nice to be able to do that through the GUI: reset this task, or if a download task has been running for a set period, maybe pause or skip it?

@meeb
Owner

meeb commented May 6, 2021

Yes, there are a number of ongoing problems with background task and scheduling reliability which didn't crop up in initial testing, like the other issue you created for #34. Most of these are handled within Django Background Tasks, which makes them a little difficult to work around in this downstream application. There is a "reset tasks" button at the bottom of the Tasks tab, but that just deletes all scheduled tasks and then re-scans every bit of media to determine whether any outstanding tasks are not scheduled yet, which is overkill for most of these "single task got stuck" issues. Reworking tasks and reworking how sources are initially catalogued are probably the two largest outstanding issues to fix before a 1.0 release.

@Code-Slave
Author

Great. If you want me to test anything specifically, just let me know.

@Code-Slave
Author

Just FYI, I'm still getting the MySQL server disappeared error at least once or twice a day.

@meeb
Owner

meeb commented May 7, 2021

Thanks, I'll see what else can be done. It may end up requiring a recommended MySQL server config tweak.

@Code-Slave
Author

Here's a consistent error:

2021-05-14 15:10:58,256 [tubesync/INFO] Scheduling task to download thumbnail for: How To Build A Simple DIY TV Stand...Or Media Console...Or Whatever... ¯_(ツ)_/¯ from: https://i.ytimg.com/vi_webp/4C1Namjo0dM/maxresdefault.webp

I get 500 server errors when it tries that. This is the video:
https://www.youtube.com/watch?v=4C1Namjo0dM

@meeb
Owner

meeb commented May 15, 2021

And the 500 error you get is specifically the "Incorrect string value" one? Or a timeout?

@Code-Slave
Author

2021-05-15 10:08:46,625 [tubesync/INFO] Scheduling task to download thumbnail for: Wooden Tabletop Christmas Trees 🎄 from: https://i.ytimg.com/vi_webp/0h7e0IV_rzw/maxresdefault.webp

Another one.

I get no error in the log, just a 500 error on the site.

@Code-Slave
Author

It looks like when titles have the stupid emoji or whatever chars in them, it kills it.

@Code-Slave
Author

Testing this out:

ALTER TABLE your_database_name.your_table CONVERT TO CHARACTER SET utf8;

I ran that on the background tasks and completed background tasks tables and it looks like it's working. Still testing.

@Code-Slave
Author

It seems that fixes this one:

Index media from source "Channel58", attempted 6 times
Error: "(1366, "Incorrect string value: '\xEF\xBC\xAC\xEF\xBC\xAF...' for column tubesync.background_task.verbose_name at row 1")"

@meeb
Owner

meeb commented May 15, 2021

Thanks, that should indeed fix it. I think I never discovered this because I create all MySQL databases with utf8 encoding by default, which is inherited (I think? I'm not really a MySQL guy) by tables when they are created. If you created the initial database with CREATE DATABASE tubesync rather than CREATE DATABASE tubesync CHARACTER SET utf8, for example, that is possibly the cause. I'll look at tweaking the docs, and there's an array of Django connection parameters related to this for MySQL that might work.

Thanks for digging into this.

@Code-Slave
Author

Is there a command-line way to reset all tasks? Because of my channel sizes, I think it's timing out and the workers die.

@meeb
Owner

meeb commented May 16, 2021

@Code-Slave
Author

The emojis in titles are still killing it for thumbnail names.

@Code-Slave
Author

I think I'm getting close. See here:
https://stackoverflow.com/questions/20411440/incorrect-string-value-xf0-x9f-x8e-xb6-xf0-x9f-mysql

I think straight utf8 won't work, and you may have to add the charset to the connect string. Creating a new DB to test.
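For context on why plain utf8 may not be enough: MySQL's "utf8" character set is an alias for utf8mb3, which stores at most three bytes per character, while most emoji need four. The sketch below checks the two titles from the logs above; the function names are made up for illustration and are not part of TubeSync.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes any single character needs."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_mysql_utf8(text: str) -> bool:
    """True if every character fits MySQL's 3-byte "utf8" (utf8mb3) charset."""
    return max_utf8_bytes(text) <= 3


# Titles copied from the log lines earlier in this thread.
titles = [
    "How To Build A Simple DIY TV Stand...Or Media Console...Or Whatever... ¯_(ツ)_/¯",
    "Wooden Tabletop Christmas Trees 🎄",
]

for title in titles:
    # Print whether the title fits utf8mb3 and its widest character in bytes.
    print(fits_mysql_utf8(title), max_utf8_bytes(title), title)
```

The first title only uses characters of up to three bytes, so it fits utf8; the Christmas tree emoji needs four bytes and only fits utf8mb4.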

@meeb
Owner

meeb commented May 16, 2021

Yeah, try:
USE tubesync; ALTER TABLE background_task CONVERT TO CHARACTER SET utf8mb4;
or similar.
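A rough sketch of generating that statement for each affected table follows. CONVERT TO CHARACTER SET applies to a whole table, so each statement names a table rather than a column. The background_task name comes from the error message above; background_task_completedtask is an assumed name for the completed tasks table and should be checked against the actual schema. The helper itself is hypothetical.

```python
import re

# Only allow simple identifiers so the generated SQL cannot be abused.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def convert_table_sql(table: str, charset: str = "utf8mb4") -> str:
    """Build an ALTER TABLE statement converting one table to the given charset."""
    if not _IDENTIFIER.match(table) or not _IDENTIFIER.match(charset):
        raise ValueError("table and charset must be plain identifiers")
    return f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset};"


# background_task is named in the error; the second table name is an assumption.
TABLES = ["background_task", "background_task_completedtask"]

for table in TABLES:
    print(convert_table_sql(table))
```

The printed statements can then be run in a MySQL client after USE tubesync; taking a backup first is sensible, since the conversion rewrites the table.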

@Code-Slave
Author

It's all utf8mb4. I think the charset needs to be set in the connect string too. Reading that link, they say it does.

@Code-Slave
Author

https://stackoverflow.com/questions/15943938/django-charset-and-encoding

I think the encoding needs to be passed. I recreated the DB as utf8mb4 and it still failed.

@meeb
Owner

meeb commented May 16, 2021

utf8 was already the default for MySQL connections from Django. I've changed this to utf8mb4 in :latest though, so give that a try and see if it helps.
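For anyone setting this by hand, a generic Django MySQL database entry with the connection charset set explicitly might look like the sketch below. This is not TubeSync's actual settings file: the user, password, host and port are placeholders, and only the OPTIONS charset entry reflects the change discussed here.

```python
# Sketch of a Django settings fragment; all credentials below are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "tubesync",
        "USER": "tubesync",        # hypothetical user
        "PASSWORD": "change-me",   # hypothetical password
        "HOST": "localhost",       # hypothetical host
        "PORT": "3306",            # default MySQL port
        "OPTIONS": {
            # Passed through to the MySQL driver so the connection itself
            # uses 4-byte UTF-8, matching a utf8mb4 database.
            "charset": "utf8mb4",
        },
    }
}
```

The database and tables still need to be utf8mb4 as well; as reported above, recreating the database alone did not fix the emoji titles.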

@Code-Slave
Author

Welp, so far it has added those tasks. The DB is utf8mb4 and with latest, so far so good. Thanks for working on this. I was on a mission this morning.

@Code-Slave
Author

Reindexing 17 sources now, so will see how it goes.

@meeb
Owner

meeb commented May 16, 2021

No problem, it's pretty important MySQL works as a stable backend so thanks for testing!

@Code-Slave
Author

No issues with the emoji now. I think the docs need a reminder to create the DB with that encoding.

Still losing connection to the DB, and if it's in the middle of a download, that item stays locked in background task with a PID that no longer exists. It also keeps any other tasks from starting up.
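One way such a stale lock could be detected is by checking whether the recorded PID still belongs to a live process. The sketch below is POSIX only and purely illustrative: it assumes the lock is stored as a PID string in a field called locked_by, which should be verified against the actual schema, and neither function is part of TubeSync or Django Background Tasks.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Best-effort check (POSIX only) for whether a process with this PID exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking, nothing is sent
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def is_stale_lock(locked_by: str) -> bool:
    """Treat a lock as stale when its recorded PID is not a live process."""
    try:
        pid = int(locked_by)  # locked_by holding a PID string is an assumption
    except (TypeError, ValueError):
        return False  # unknown format, leave the lock alone
    return not pid_is_running(pid)


# The current process is alive, so a lock held by it is not stale.
print(is_stale_lock(str(os.getpid())))
```

PIDs can be reused, particularly inside containers, so a check like this is only a hint and would need to be combined with something like a lock age threshold.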

@meeb meeb closed this as completed in 874c71b May 17, 2021
@meeb
Owner

meeb commented May 17, 2021

OK, I've added the DB encoding note to the docs.

If you continue to get connection loss issues please open a new issue as that's a different MySQL related problem!

Thanks again for the testing.
