This repository has been archived by the owner on May 4, 2021. It is now read-only.

Handle malformed publishing dates #27

Closed
pushrbx opened this issue Jan 5, 2018 · 0 comments

pushrbx commented Jan 5, 2018

Example page: https://myanimelist.net/manga/34
Published: Jul 1998 to Mar 1, 2000
The start date ("Jul 1998") has only a month and a year, with no day number, so neither format tried by parse_profile_date matches it:

Traceback (most recent call last):
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\myanimelist\utilities.py", line 157, in parse_profile_date
    parsed_date = datetime.datetime.strptime(text, '%b %d, %I:%M %p')
  File "C:\Users\pushrbx\AppData\Local\Programs\Python\Python36-32\Lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\pushrbx\AppData\Local\Programs\Python\Python36-32\Lib\_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data 'Jul 1998' does not match format '%b %d, %I:%M %p'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\myanimelist\manga.py", line 116, in parse_sidebar
    publish_start = utilities.parse_profile_date(published_parts[0])
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\myanimelist\utilities.py", line 159, in parse_profile_date
    parsed_date = datetime.datetime.strptime(text, '%b %d, %Y %I:%M %p')
  File "C:\Users\pushrbx\AppData\Local\Programs\Python\Python36-32\Lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\pushrbx\AppData\Local\Programs\Python\Python36-32\Lib\_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data 'Jul 1998' does not match format '%b %d, %Y %I:%M %p'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "C:\Users\pushrbx\Documents\Projects\it-scraper\mal_crawler\spiders.py", line 181, in parse
    parse_result = manga.parse(page)
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\myanimelist\media.py", line 337, in parse
    media_info = self.parse_sidebar(media_page)
  File "C:\Users\pushrbx\.pyenvs\scraping-test\lib\site-packages\myanimelist\manga.py", line 119, in parse_sidebar
    message="Could not parse first of two publish dates")
myanimelist.manga.MalformedMangaPageError: Could not parse first of two publish dates
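
Both formats in the traceback expect a day and a clock time, so a month-and-year string like 'Jul 1998' can never match them. Below is a minimal sketch of a more tolerant parser, not the change that actually closed this issue. The names `parse_partial_date`, `parse_published` and `_FORMATS` are hypothetical and not part of the library; the sketch assumes English month abbreviations (the default C locale) and that an open-ended range is written with `?` as the end date.

```python
import datetime

# Hypothetical fallback formats, tried from most to least specific.
_FORMATS = ("%b %d, %Y", "%b %Y", "%Y")


def parse_partial_date(text):
    """Parse 'Mar 1, 2000', 'Jul 1998' or '1998' into a date.

    strptime fills any missing month or day with 1.
    """
    text = text.strip()
    for fmt in _FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError("unrecognised date: %r" % text)


def parse_published(text):
    """Split 'START to END' and parse each side; END may be absent or '?'."""
    start, _, end = text.partition(" to ")
    end = end.strip()
    # Assumption: '?' (or nothing) marks a series that is still publishing.
    end_date = parse_partial_date(end) if end and end != "?" else None
    return parse_partial_date(start), end_date


print(parse_published("Jul 1998 to Mar 1, 2000"))
```

Trying the most specific format first means a full date is never truncated to its month. The trade-off is that 'Jul 1998' and 'Jul 1, 1998' become indistinguishable once parsed, so a caller that needs to know the original precision would have to record which format matched.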
@pushrbx pushrbx self-assigned this Jan 5, 2018
@pushrbx pushrbx added the bug label Jan 5, 2018
@pushrbx pushrbx closed this as completed in 21c06f5 Jan 6, 2018