This repository has been archived by the owner on Apr 11, 2024. It is now read-only.

Webtoon and error 10054 #459

Open
bumbaras opened this issue Aug 31, 2022 · 10 comments

@bumbaras

bumbaras commented Aug 31, 2022

For some time I have occasionally been getting error 10054 (the connection was abruptly closed by the remote host). An error is an error, but manga-py doesn't stop on it; it tries again and is able to continue the download, so one could say it is not an issue at all. The closed connection almost always occurs at the end of a chapter and very rarely in the middle of one. It usually happens once every 5 to 8 chapters.
Although there seems to be no problem when the connection is closed in the middle of a chapter, because manga-py is able to resume the download and the archive seems complete, today I found that manga-py omits the chapter that should follow right after the closed connection if the error occurred when the previous chapter had just been completed. So if the error occurs just as the next chapter should start, that chapter will be missing.

Parse chapters. Please, wait
100% (68 of 68) |##############################| Elapsed Time: 0:01:00 Time: 0:01:00
100% (79 of 79) |##############################| Elapsed Time: 0:00:52 ETA: 00:00:00
[ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(10054, 'Istniejące połączenie zostało gwałtownie zamknięte przez zdalnego hosta', None, 10054, None)))]
100% (79 of 79) |##############################| Elapsed Time: 0:01:13 Time: 0:01:13

Because there is no indication of which chapter is currently being downloaded, there is no way to notice that something is missing. I asked about this in a feature request some time ago. In the example above, the error occurred just before the next chapter should have started (ETA shows 0 s), and the progress bar after the error shows the chapter from before the error. There is one more chapter to download, but it never started. The workaround for now is to repeat the download manually.
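Until the progress output names the chapter, one way to notice a skipped chapter is to compare the expected chapter list with the archives on disk. This is a minimal sketch, not part of manga-py: `find_missing_chapters`, the `.zip` suffix, and the idea that each archive is named after its chapter are all assumptions made for illustration.

```python
from pathlib import Path


def find_missing_chapters(expected_names, output_dir, suffix='.zip'):
    """Return the expected chapter names that have no archive in output_dir.

    Assumes (hypothetically) that each chapter is saved as <name><suffix>.
    """
    # Collect the archive names found on disk, without the suffix.
    present = {path.stem for path in Path(output_dir).glob('*' + suffix)}
    # Keep the original order so the report matches the chapter order.
    return [name for name in expected_names if name not in present]
```

Running it after a download and re-requesting only the returned names would replace the "repeat everything manually" workaround with a targeted retry.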

@1271 1271 added the bug label Aug 31, 2022
@bumbaras
Author

bumbaras commented Sep 2, 2022

Not sure whether it will help with 10054, or whether it is worth putting in your code, but I modified parser.py a little. In def check_url(self, url):
I wrapped the with get(url, stream=True, proxies=proxies) as response: block in a retry loop:

        did_pass = False
        # Retry until the request succeeds (note: there is no attempt limit).
        while not did_pass:
            try:
                with get(url, stream=True, proxies=proxies) as response:
                    # Keep the final URL in case the request was redirected.
                    url = response.url
                did_pass = True
            except requests.exceptions.RequestException:
                print('Failed to connect to %s' % url)

I wasn't able to get the error code, though. With the exception caught, the script retries the connection instead of aborting. It's a silly modification, of course ...

@1271
Member

1271 commented Sep 2, 2022

If you want to stop the script as a whole, you need some flag that you check in the main loop (in general, wherever the methods are called).
That is, if the flag has an invalid value, you abort.
Regarding this method: to exit the loop, you must raise an exception or do a break/return.
Otherwise you'll just be stuck there forever.
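The point about leaving the loop can be sketched as a retry with an attempt limit: it returns on success and re-raises after the last failure, so it can never spin forever. This is only an illustration under stated assumptions; `call_with_retries`, `action`, `max_attempts`, and `base_delay` are hypothetical names, not manga-py code. It catches `OSError` because both the built-in `ConnectionResetError` and `requests.exceptions.RequestException` derive from it.

```python
import time


def call_with_retries(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or max_attempts is reached.

    action is any zero-argument callable, e.g. a lambda wrapping a request.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()  # success: leave the loop via return
        except OSError as error:
            if attempt == max_attempts:
                raise  # give up: let the caller see the last error
            # Wait a little longer after every failed attempt.
            delay = base_delay * 2 ** (attempt - 1)
            print('Attempt %d failed (%r), retrying in %.1f s' % (attempt, error, delay))
            sleep(delay)
```

A call site might look like `call_with_retries(lambda: requests.request(method, url, **kwargs))`. Because the final failure is re-raised, the main loop can still decide whether to abort or to mark the chapter as failed.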

@1271
Member

1271 commented Sep 2, 2022

As for the method itself, it only checks the url for redirects.
This is definitely not the reason for the behavior you described.

@bumbaras
Author

bumbaras commented Sep 2, 2022

No, the reason I don't put a raise there is that I don't want to throw an exception. Python is still new to me, and internet resources are my teachers in my free time. If the error were thrown every time, this modification would become an infinite loop of error messages until the user interrupts the run with Ctrl+C. I thought about incrementing a counter and breaking out of the loop that way, but the error occurs only from time to time, so this is not an issue for me.
And I was looking for where this one comes from: [ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(10054, 'Istniejące połączenie zostało gwałtownie zamknięte przez zdalnego hosta', None, 10054, None)))] but I am still unsure where it is. I have a few clues, but manga-py is written in a way that is too complex for my current knowledge. But well, step by step, and maybe I will learn something.

@1271
Member

1271 commented Sep 2, 2022

You're right. This project is not the best way to learn Python.
Initially it was just a single-file script "for myself", but over time I added more and more code to it without caring about the architecture (unfortunately).
I wanted to rewrite manga-py many times, but I didn't have the time and/or enthusiasm.
At the moment I'm too busy at work, and on weekends I have no desire to do anything, unfortunately.
Perhaps one day I will return to development and rewrite everything here, but I can't say exactly when.

@bumbaras
Author

bumbaras commented Sep 2, 2022

I know the feeling of being overworked and wanting to spend weekends on anything but more work :). I must say I am impressed by how you know what to change, and by the fact that you have made many changes and it still works correctly. I use it only for Webtoons; I tried other sites I know, but they don't want to cooperate, either because of Cloudflare or because of changes in the website code. The fact that it works with Webtoons is enough for me.

@bumbaras
Author

bumbaras commented Sep 5, 2022

This seems to be the place I was looking for:

def _requests_helper(self, method, url, **kwargs) -> requests.Response:
    r = requests.request(method, url, **kwargs)

    if len(r.history):
        for i in r.history:
            self.__update_cookies(i)
    self.__update_cookies(r)

    return r

in manga-py\http\request.py.
I have put my "infinite loop" there, and there seem to be no more missing chapters.

@bumbaras
Author

After a few adjustments I made to the code (what I call the "infinite loop"), another error appeared. Previously, on a critical error manga-py just threw an exception and stopped. Now it aborts the current chapter and moves on to the next one. But the temp folder is not cleared of old files, so the next chapter contains previously downloaded pages, because manga-py does not re-download existing files. And yesterday I found that even when a chapter is downloaded correctly, manga-py deletes only as many images as the downloaded chapter should contain. In short: if the crashed chapter contained 70 images and manga-py aborted it at image 65 (where the error occurred), none of the downloaded images are deleted, and they are put into the next chapter without being re-downloaded.
Moreover, if the second chapter should contain fewer images, for example 60, then images 61 to 65 will not be deleted even though the chapter is flagged as downloaded properly, and they will be put into the next chapter that should contain that many images. I found this last issue only yesterday, so it seems some of my downloads so far contain broken chapters. I still don't understand all of manga-py's code, but I will have to somehow add a function that clears the temp folder after each chapter.
BTW, is manga-py still being developed?
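The leftover-pages problem described in this comment can be avoided if every chapter gets its own temporary directory that is removed whether the chapter succeeds or fails. The sketch below is hypothetical and does not reflect how manga-py is structured: `download_chapter`, `fetch_page`, and `save_archive` are invented names, and the `.jpg` naming is an assumption.

```python
import tempfile
from pathlib import Path


def download_chapter(chapter_name, page_urls, fetch_page, save_archive):
    """Download one chapter into its own temporary directory.

    fetch_page(url) -> bytes and save_archive(chapter_name, files) are
    hypothetical callables supplied by the caller, not manga-py functions.
    """
    # The directory is removed when the with-block exits, also on an
    # exception, so a failed chapter cannot leak pages into the next one.
    with tempfile.TemporaryDirectory(prefix='chapter_') as tmp:
        files = []
        for index, url in enumerate(page_urls, start=1):
            target = Path(tmp) / ('%03d.jpg' % index)
            target.write_bytes(fetch_page(url))
            files.append(target)
        # Archive while the files still exist, then let cleanup happen.
        save_archive(chapter_name, files)
```

The design choice is to tie cleanup to the chapter's scope instead of counting how many images to delete afterwards, which is where the 61 to 65 leftovers in the example came from.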

@1271
Member

1271 commented May 29, 2023

@bumbaras
It's a difficult question. Yes and no.
A lot of sites were placed behind Cloudflare, which caused a lot of problems (not only for manga-py, but also for real users).
Plus, keeping up with changes on a large number of sites was exhausting. Sorry it happened.
I tried several times to rewrite everything, but in the end I gave up.
Plus, the last couple of years have been really stressful. Perhaps in the future I will still make something like manga-py3, but I'm not sure.

Regarding your question:

Yes, images will not be deleted if an exception was thrown (it would be nice to fix this)
No, manga-py does not keep a list of files that should be added to the archive (instead, the script scans the temporary directory and adds everything it finds to the archive)
This behavior should have been changed. I thought about adding files to the archive as I get them (immediately after downloading), but in the end it works the way it works.

// I hope the translator conveyed everything correctly. Sorry my english is really bad
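The alternative mentioned in this reply, archiving from a recorded list instead of scanning the temporary directory, might look roughly like the following. It is a sketch under assumptions: `archive_chapter` and `downloaded_files` are hypothetical names, and a plain uncompressed zip is assumed as the archive format.

```python
import zipfile
from pathlib import Path


def archive_chapter(archive_path, downloaded_files):
    """Write only the files recorded during download into a zip archive.

    downloaded_files is the list collected while downloading, so stale
    files that happen to sit in the same directory are ignored.
    """
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_STORED) as archive:
        for path in downloaded_files:
            # arcname keeps just the file name, not the temp directory path.
            archive.write(path, arcname=Path(path).name)
    return archive_path
```

With an explicit list, a stale image left over from an earlier chapter is never picked up, even if cleanup of the temporary directory was skipped.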

@bumbaras
Author

bumbaras commented Jul 4, 2023

Yes, images will not be deleted if an exception was thrown (it would be nice to fix this)

The infinite loop gave me more, but different, issues, which occurred only when an exception was thrown. But it is still better now.

No, manga-py does not keep a list of files that should be added to the archive (instead, the script scans the temporary directory and adds everything it finds to the archive)

I can't agree with this one. I put additional images with proper names into the temp folder, and:

  1. existing images weren't replaced with the proper ones and were put into the archive.
  2. any additional images that shouldn't be in the chapter, for example images numbered higher than the chapter should have,
    were left intact: they didn't appear in the archive and weren't deleted by the program. I am using manga-py only for Webtoons,
    so maybe a different provider means different manga-py behaviour.

// I hope the translator conveyed everything correctly. Sorry my english is really bad

My English is much worse, although I don't use a translator for syntax; rarely used words just tend to be forgotten.


2 participants