Cannot download novel with an apostrophe #214
I think I have encountered that error before. It has something to do
with the console encoding on your computer (Windows). I found this while
searching for the issue:
"*Update:* Python 3.6
<https://docs.python.org/3.6/whatsnew/3.6.html#pep-528-change-windows-console-encoding-to-utf-8>
implements PEP 528: Change Windows console encoding to UTF-8
<https://www.python.org/dev/peps/pep-0528/>: *the default console on
Windows will now accept all Unicode characters.* Internally, it uses the
same Unicode API as the win-unicode-console package mentioned below
<https://github.com/Drekin/win-unicode-console>. print(unicode_string)
should just work now."
So I suggest either:
1. updating your Python to 3.6, or
2. installing the package with pip install win-unicode-console
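To decide which of the two suggestions applies, it may help to look at what the interpreter reports about itself. The following is only a minimal diagnostic sketch; `console_report` is a hypothetical helper name, not part of lightnovel-crawler, and the PEP 528 check is a rough approximation (the PEP covers the interactive Windows console, not redirected output).

```python
import sys


def console_report(stream=None):
    """Collect the facts relevant to the two suggestions above.

    Hypothetical helper for illustration; not part of lightnovel-crawler.
    """
    stream = stream if stream is not None else sys.stdout
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "platform": sys.platform,
        # May be None for some replaced or redirected streams.
        "stdout_encoding": getattr(stream, "encoding", None),
        # PEP 528 only matters on Windows with Python 3.6 or newer.
        "pep528_available": sys.platform == "win32"
        and sys.version_info >= (3, 6),
    }


if __name__ == "__main__":
    for key, value in console_report().items():
        print("%s: %s" % (key, value))
```

If `stdout_encoding` shows a legacy code page such as cp437 and `pep528_available` is false, the encoding explanation above is at least plausible.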
…On Tue, Oct 22, 2019 at 5:54 PM KiraYamatoSD ***@***.***> wrote:
Hi, I have encountered an issue where the downloader stops working with
titles containing an apostrophe.
Example 1: https://babelnovel.com/books/the-school-s-omnipotent-useless-garbage
Example 2: https://babelnovel.com/books/soldier-king-s-love-story-in-the-city
Attached is the log.
Z:\SharedFolder\python_apps\lightnovel-crawler>lightnovel-crawler -lll --multi --all --ignore --add-source-url --suppress --format "epub" --source "https://babelnovel.com/books/the-school-s-omnipotent-useless-garbage"
================================================================================
Lightnovel Crawler #2.16.0
https://github.com/dipu-bd/lightnovel-crawler
--------------------------------------------------------------------------------
<< LOG LEVEL: DEBUG
--------------------------------------------------------------------------------
! Input is suppressed
--------------------------------------------------------------------------------
Namespace(add_source_url=True, all=True, bot=None, chapters=None, extra={}, first=None, force=False, ignore=True, last=None, list_sources=False, log=3, login=None, multi=True, novel_page='https://babelnovel.com/books/the-school-s-omnipotent-useless-garbage', output_formats=['epub'], output_path=None, page=None, query=None, range=None, single=False, sources=False, suppress=True, volumes=None)
2019-10-22 18:45:24,560 [DEBUG] (urllib3.connectionpool)
Starting new HTTP connection (1): bit.ly:80
2019-10-22 18:45:25,635 [DEBUG] (urllib3.connectionpool)
http://bit.ly:80 "GET /2yYyFGd HTTP/1.1" 301 132
2019-10-22 18:45:25,638 [DEBUG] (urllib3.connectionpool)
Starting new HTTPS connection (1): pypi.org:443
2019-10-22 18:45:27,851 [DEBUG] (urllib3.connectionpool)
https://pypi.org:443 "GET /pypi/lightnovel-crawler/json HTTP/1.1" 200 14268
-> Press Ctrl + C to exit
2019-10-22 18:45:28,716 [WARNING] (DOWNLOADER)
CairoSVG was not loaded properly. SVG to PNG conversion will fail.
2019-10-22 18:45:28,719 [INFO] (APP)
Initialized App
2019-10-22 18:45:28,721 [INFO] (APP)
Detected URL input
2019-10-22 18:45:28,723 [INFO] (APP)
Initializing crawler for: https://babelnovel.com/
Retrieving novel info...https://babelnovel.com/books/the-school-s-omnipotent-useless-garbage
2019-10-22 18:45:28,744 [DEBUG] (urllib3.connectionpool)
Starting new HTTPS connection (1): babelnovel.com:443
2019-10-22 18:45:32,364 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET / HTTP/1.1" 200 None
2019-10-22 18:45:39,459 [INFO] (BABELNOVEL)
Getting https://babelnovel.com/content-css?hash=a2d57dcc8e2040f0e577739a3c210406
2019-10-22 18:45:40,987 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET /content-css?hash=a2d57dcc8e2040f0e577739a3c210406 HTTP/1.1" 200 None
2019-10-22 18:45:40,987 [INFO] (BABELNOVEL)
Bad selectors: #PWUHVHPE, .ATOYDHBM, #XXNHGWPR, #HUYWZSND, #ZDFENBFN, .HPULPMNR, .WSMETHAH, #YMMPZIET, .SOGOCAKM, .LAHOTOGB, .NEXKCSUV, .GWNKTBBC, .FSSBCGNM, .BDFQLKSJ, .NECPSWPE, #VYNJGYTY, .THBTZSRT, .SIRVIKXO, #EMVXBWRY, #DIKWQORG, .TCCHZEPN, #SIHGGFEZ, .NVGHFOHA, .GWBNRNBP, #HEOZHUZQ, .KZOSKRUS, .BXYKSBAY, .XJOCEALK, .VESBQMHL, .VTDWQDMV, #XINFRKMG, .DOUSNMTR, .IZBNVAMB, .XQLZGAWK, #WNYNDLIM, .CQOYNJOA, #XDHTNZAY, #KWJWJYBA, .SYVXHZCI, .DPYRQBMM, #UKEEEXUP, .ICFBHAKD, #NLBNUZED, #ELGQPNAX, #QFTGNLAR, #KHQMBQRS, #ODJGHVCX, .SOMGGJDN, #HGMSPNZM, #FOZBTZON
2019-10-22 18:45:40,987 [INFO] (BABELNOVEL)
Canonical name: the-school-s-omnipotent-useless-garbage
2019-10-22 18:45:40,987 [DEBUG] (BABELNOVEL)
Visiting https://babelnovel.com/api/books/the-school-s-omnipotent-useless-garbage
2019-10-22 18:45:41,766 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET /api/books/the-school-s-omnipotent-useless-garbage HTTP/1.1" 200 None
2019-10-22 18:45:41,766 [INFO] (BABELNOVEL)
Novel ID: 5c233f27-c124-455a-bd0c-538f4c06cbae
2019-10-22 18:45:41,766 [INFO] (BABELNOVEL)
Novel title: The School\u2019s Omnipotent Useless Garbage
2019-10-22 18:45:41,766 [INFO] (BABELNOVEL)
Novel cover: https://img.babelchain.org/book_images/The School\u2019s Omnipotent Useless Garbage.jpg
2019-10-22 18:45:41,766 [DEBUG] (BABELNOVEL)
Visiting https://babelnovel.com/api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=0&pageSize=100&fields=id,name,canonicalName,hasContent
2019-10-22 18:45:41,782 [DEBUG] (BABELNOVEL)
Visiting https://babelnovel.com/api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=1&pageSize=100&fields=id,name,canonicalName,hasContent
2019-10-22 18:45:41,782 [DEBUG] (urllib3.connectionpool)
Starting new HTTPS connection (2): babelnovel.com:443
2019-10-22 18:45:42,561 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET /api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=0&pageSize=100&fields=id,name,canonicalName,hasContent HTTP/1.1" 200 None
2019-10-22 18:45:42,561 [DEBUG] (BABELNOVEL)
Visiting https://babelnovel.com/api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=2&pageSize=100&fields=id,name,canonicalName,hasContent
2019-10-22 18:45:43,200 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET /api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=2&pageSize=100&fields=id,name,canonicalName,hasContent HTTP/1.1" 200 None
2019-10-22 18:45:44,244 [DEBUG] (urllib3.connectionpool)
https://babelnovel.com:443 "GET /api/books/5c233f27-c124-455a-bd0c-538f4c06cbae/chapters?bookId=5c233f27-c124-455a-bd0c-538f4c06cbae&page=1&pageSize=100&fields=id,name,canonicalName,hasContent HTTP/1.1" 200 None
2019-10-22 18:45:44,244 [INFO] (BABELNOVEL)
3 volumes and 262 chapters found
Traceback (most recent call last):
  File "c:\python35\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\python35\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python35\Scripts\lightnovel-crawler.exe\__main__.py", line 9, in <module>
  File "c:\python35\lib\site-packages\lncrawl\__init__.py", line 13, in main
    start_app()
  File "c:\python35\lib\site-packages\lncrawl\core\__init__.py", line 78, in start_app
    raise err
  File "c:\python35\lib\site-packages\lncrawl\core\__init__.py", line 75, in start_app
    run_bot(bot)
  File "c:\python35\lib\site-packages\lncrawl\bots\__init__.py", line 18, in run_bot
    ConsoleBot().start()
  File "c:\python35\lib\site-packages\lncrawl\bots\console.py", line 62, in start
    self.app.get_novel_info()
  File "c:\python35\lib\site-packages\lncrawl\core\app.py", line 130, in get_novel_info
    print('NOVEL: %s' % self.crawler.novel_title)
  File "c:\python35\lib\site-packages\colorama\ansitowin32.py", line 41, in write
    self.__convertor.write(text)
  File "c:\python35\lib\site-packages\colorama\ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "c:\python35\lib\site-packages\colorama\ansitowin32.py", line 190, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "c:\python35\lib\site-packages\colorama\ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
  File "c:\python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 17: character maps to <undefined>
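The failure in the traceback can probably be reproduced without the crawler or a Windows console, by encoding the same text with the cp437 codec named in the last frame. This is a minimal sketch: the title string is taken from the log above, the `'NOVEL: %s'` format from the `app.py` frame, and `try_encode` is a hypothetical helper introduced only for illustration.

```python
# Title as logged by the crawler; \u2019 is the typographic apostrophe.
title = "The School\u2019s Omnipotent Useless Garbage"
line = "NOVEL: %s" % title


def try_encode(text, encoding):
    """Return the UnicodeEncodeError raised for `text`, or None if it encodes."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        return err
    return None


err = try_encode(line, "cp437")
if err is not None:
    # err.start is the index of the first character that could not be encoded.
    print("cp437 fails at position", err.start, "on", ascii(line[err.start]))

# The same text encodes without error as UTF-8.
print("utf-8 ok:", try_encode(line, "utf-8") is None)
```

The position reported for cp437 lines up with the one in the log, which suggests the apostrophe in the title, not the URL, is what the console codec rejects.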
I had a similar guess. In the next version I am reviewing all prints and logs to remove any Unicode characters when running on Windows.
macOS and Linux have UTF-8 support in the terminal. Only the damn command prompt is different.
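One way such a cleanup could look is sketched below. This is not the change that shipped in the crawler, only an illustration under assumptions: `console_safe` and `ASCII_FALLBACKS` are hypothetical names, and the fallback table covers just a few typographic quotes.

```python
import sys

# Hypothetical fallback table: typographic quotes mapped to ASCII look-alikes.
ASCII_FALLBACKS = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def console_safe(text, encoding=None):
    """Return `text` in a form that the given (or stdout's) encoding can represent."""
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    try:
        text.encode(encoding)
        return text  # Already printable as-is, e.g. on a UTF-8 terminal.
    except UnicodeEncodeError:
        pass
    # Swap known characters for ASCII first, then replace whatever is left with '?'.
    text = text.translate(str.maketrans(ASCII_FALLBACKS))
    return text.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    print("NOVEL: %s" % console_safe("The School\u2019s Omnipotent Useless Garbage"))
```

Text is left untouched when the terminal can already display it, so only consoles with a legacy code page see the substituted characters.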
A similar error was reported here: tartley/colorama#219. I have used
@KiraYamatoSD, can you check whether this issue is still there in version 2.16.2?
@dipu-bd Thanks for the fix. However, there is a new error after testing the update: some chapters cannot be crawled even though they are available on the source site.
It is a babelnovel.com-specific issue. We are working to fix it. Closing this since the main issue is solved.