"URL couldn't be processed: %s" when calling find_date() #53
Comments
Hi @HubLubas, I cannot reproduce the bug:
Are you using the latest version? Which system are you on?
Hi @adbar, I'm working in a Jupyter notebook on Google Colab.
I found the cause of the problem.
Hi @jifan-chen, an error is raised because the web page couldn't be downloaded. You can try to use another download utility or an archived version of the page. Then you can use this pre-existing HTML file as input to the
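The workaround suggested above could look roughly like the sketch below: download the page separately, then pass the HTML string to `find_date` instead of the URL. The helper names `fetch_html` and `date_from_page` and the User-Agent string are hypothetical, chosen for illustration only. Whether a given site accepts such a request is not guaranteed, so a failed download is reported as `None` rather than raised.

```python
import urllib.request
from typing import Optional


def fetch_html(url: str, timeout: float = 10.0) -> Optional[str]:
    """Download a page with the standard library; return None on any failure."""
    try:
        # Hypothetical User-Agent: some sites reject the default Python one.
        request = urllib.request.Request(
            url, headers={"User-Agent": "Mozilla/5.0 (compatible; example-fetcher)"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (ValueError, OSError):
        # URLError and HTTPError are subclasses of OSError;
        # ValueError covers malformed URLs.
        return None


def date_from_page(url: str) -> Optional[str]:
    """Fetch the HTML ourselves, then hand the string to htmldate."""
    from htmldate import find_date  # third-party package: pip install htmldate

    html = fetch_html(url)
    if html is None:
        return None  # download failed; an archived copy may be worth trying
    # find_date takes a `url` parameter (see the signature in the traceback),
    # so the original address can still be passed along.
    return find_date(html, url=url)
```

If `date_from_page` returns `None`, either the download failed or no date was found; an HTML file saved by another tool can be read from disk and passed to `find_date` in the same way.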
I have a problem extracting the date from a website.
date = find_date('https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731');
I got the following error:
ValueError Traceback (most recent call last)
in ()
----> 1 date = find_date('https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731');
1 frames
/usr/local/lib/python3.7/dist-packages/htmldate/core.py in find_date(htmlobject, extensive_search, original_date, outputformat, url, verbose, min_date, max_date)
598 if verbose is True:
599 logging.basicConfig(level=logging.DEBUG)
--> 600 tree = load_html(htmlobject)
601 find_date.extensive_search = extensive_search
602 min_date, max_date = get_min_date(min_date), get_max_date(max_date)
/usr/local/lib/python3.7/dist-packages/htmldate/utils.py in load_html(htmlobject)
165 # log the error and quit
166 if htmltext is None:
--> 167 raise ValueError("URL couldn't be processed: %s", htmlobject)
168 # start processing
169 tree = None
ValueError: ("URL couldn't be processed: %s", 'https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731')
I would be grateful for any support and help with this.
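A side note on the message itself: the literal `%s` in the error above is consistent with how Python handles an exception that receives several arguments. They are stored as a tuple and never interpolated. The snippet below is a minimal illustration using a placeholder URL, not code from htmldate.

```python
url = "https://example.org/page"  # placeholder URL for illustration

# Two positional arguments: stored as a tuple, the %s is left untouched.
unformatted = ValueError("URL couldn't be processed: %s", url)

# Interpolating before raising gives a single readable message.
formatted = ValueError("URL couldn't be processed: %s" % url)

print(str(unformatted))
print(str(formatted))
```

The first form prints the tuple with the `%s` still in place, the second prints one formatted sentence.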