Traceback (most recent call last):
  File "indexer.py", line 53, in <module>
    content_trafilatura = trafilatura.extract(document, json_output=True, with_metadata=False, include_tables=False, deduplicate=True, include_comments=False)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/core.py", line 684, in extract
    max_tree_size=max_tree_size, url_blacklist=url_blacklist
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/core.py", line 586, in bare_extraction
    docmeta = extract_metadata(tree, url, date_extraction_params)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/metadata.py", line 367, in extract_metadata
    metadata['date'] = find_date(tree, **date_config)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/core.py", line 605, in find_date
    original_date, min_date, max_date)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/core.py", line 124, in examine_header
    headerdate = tryfunc(elem.get('content'))
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/extractors.py", line 385, in try_ymd_date
    customresult = custom_parse(string, outputformat, extensive_search, min_date, max_date)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/extractors.py", line 302, in custom_parse
    result = parse_datetime_as_naive(string)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
    naive = default.replace(**repl)
OverflowError: signed integer is greater than maximum
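The last frame of the traceback is a call to `datetime.replace()`, so the failure can probably be illustrated with the standard library alone. The sketch below is only an assumption about the mechanism: the page that triggered the error is not available, and `safe_replace` is a hypothetical helper for illustration, not part of htmldate, dateutil, or trafilatura. It shows how a caller might treat an out-of-range date component as "no date found" instead of letting the exception propagate.

```python
from datetime import datetime
from typing import Optional


def safe_replace(default: datetime, **fields: int) -> Optional[datetime]:
    """Return default.replace(**fields), or None if a field is out of range.

    Hypothetical helper, not an API of any library in the traceback.
    """
    try:
        return default.replace(**fields)
    except (OverflowError, ValueError):
        # OverflowError: the value is too large for the underlying C integer.
        # ValueError: the value fits but is not a valid date component.
        return None


base = datetime(2020, 12, 29)

print(safe_replace(base, year=2019))   # a valid year is applied normally
print(safe_replace(base, year=2**40))  # an absurdly large year yields None
print(safe_replace(base, year=10000))  # just past datetime.MAXYEAR also yields None
```

Catching both exception types matters here because, depending on the size of the bogus number, `replace()` can fail with either one.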
Sorry for the delay. I have lost the URL of the page in question. As for the
HTML snippet I posted, please disregard it: I realised I misspelled the
closing tags, and git won't let me correct it. The problem I reported is not
caused by poorly formatted HTML.
On Tue, 29 Dec 2020 at 13:19, Adrien Barbaresi <
notifications@github.com> wrote:
Traceback (most recent call last):
  File "indexer.py", line 53, in <module>
    content_trafilatura = trafilatura.extract(document, json_output=True, with_metadata=False, include_tables=False, deduplicate=True, include_comments=False)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/core.py", line 684, in extract
    max_tree_size=max_tree_size, url_blacklist=url_blacklist
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/core.py", line 586, in bare_extraction
    docmeta = extract_metadata(tree, url, date_extraction_params)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/trafilatura/metadata.py", line 367, in extract_metadata
    metadata['date'] = find_date(tree, **date_config)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/core.py", line 605, in find_date
    original_date, min_date, max_date)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/core.py", line 124, in examine_header
    headerdate = tryfunc(elem.get('content'))
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/extractors.py", line 385, in try_ymd_date
    customresult = custom_parse(string, outputformat, extensive_search, min_date, max_date)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/htmldate/extractors.py", line 302, in custom_parse
    result = parse_datetime_as_naive(string)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "/Users/luca/enviroments/3.7/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
    naive = default.replace(**repl)
OverflowError: signed integer is greater than maximum