"URL couldn't be processed: %s" when calling find_date() #53

HubLubas · 2022-05-28T17:17:40Z

I have a problem extracting the date from a website:
date = find_date('https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731');

I got the following error:

ValueError Traceback (most recent call last)
in ()
----> 1 date = find_date('https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731');

1 frames
/usr/local/lib/python3.7/dist-packages/htmldate/core.py in find_date(htmlobject, extensive_search, original_date, outputformat, url, verbose, min_date, max_date)
598 if verbose is True:
599 logging.basicConfig(level=logging.DEBUG)
--> 600 tree = load_html(htmlobject)
601 find_date.extensive_search = extensive_search
602 min_date, max_date = get_min_date(min_date), get_max_date(max_date)

/usr/local/lib/python3.7/dist-packages/htmldate/utils.py in load_html(htmlobject)
165 # log the error and quit
166 if htmltext is None:
--> 167 raise ValueError("URL couldn't be processed: %s", htmlobject)
168 # start processing
169 tree = None

ValueError: ("URL couldn't be processed: %s", 'https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731')

I would be grateful for any support and help with this.


adbar · 2022-06-01T15:28:12Z

Hi @HubLubas, I cannot reproduce the bug:

>>> from htmldate import find_date
>>> find_date('https://uk.investing.com/news/astrazeneca-earnings-revenue-beat-in-q4-2582731')
'2022-02-10'

Are you using the latest version? Which system are you on?

HubLubas · 2022-06-02T12:07:25Z

Hi @adbar

I'm working in a Jupyter notebook on Google Colab.
htmldate 1.2.1 /usr/local/lib/python3.7/dist-packages pip

adbar · 2022-06-02T15:40:44Z

I cannot reproduce it, it works for me:

HubLubas · 2022-06-02T16:16:36Z

I found the cause of the problem.
I changed from
!pip install htmldate
to
!pip install -U htmldate
and now it works.
Thank you @adbar for answering the issue!
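
Because the fix in this comment was an upgrade, it can help to confirm which version the notebook actually has installed before and after running pip. The following is a minimal sketch, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name, not part of htmldate.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        return None


# Report what the current environment sees for htmldate.
print("htmldate version:", installed_version("htmldate"))
```

In a notebook, the kernel may need a restart after `!pip install -U htmldate` before the upgraded package is the one being imported.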

jifan-chen · 2023-01-05T22:08:09Z

I am facing the same issue again:

adbar · 2023-01-06T11:40:29Z

Hi @jifan-chen, an error is raised because the web page couldn't be downloaded.

You can try to use another download utility or an archived version of the page. Then you can use this pre-existing HTML file as input to the find_date() function.
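
The suggestion above can be sketched as follows: download the page separately, then hand the HTML string to `find_date()` instead of the URL. This is only an illustration under stated assumptions: the standard-library `urllib` stands in for "another download utility", the `USER_AGENT` value and the helper names `build_request`, `fetch_html` and `date_from_html` are hypothetical, and a site that blocks automated requests may still refuse the download.

```python
import urllib.request

# Hypothetical User-Agent string, chosen for illustration only.
USER_AGENT = "Mozilla/5.0 (compatible; date-check/0.1)"


def build_request(url):
    """Prepare a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url, timeout=30):
    """Download a page with the standard library and return it as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def date_from_html(html, url=None):
    """Pass already-downloaded HTML to htmldate instead of a URL."""
    # Imported here so the download helpers work even without htmldate installed.
    from htmldate import find_date

    # The url argument is optional and only gives htmldate an extra hint.
    return find_date(html, url=url)
```

A call would then look like `date_from_html(fetch_html(page_url), url=page_url)`. The same `date_from_html` step applies to HTML read from a saved file or taken from an archived copy of the page.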

adbar added the "question" label (Further information is requested) Jun 2, 2022

adbar closed this as completed Jun 2, 2022
