module 'lxml.etree' has no attribute '_ElementStringResult' error since v0.45.18 #2312

searchjaunt · 2024-04-17T12:59:17Z

Describe the bug
A large number of checks return module 'lxml.etree' has no attribute '_ElementStringResult'. Not all of them fail, but the common factor seems to be that every failing watch has an xpath filter. I'm not 100% sure, though.

Version
v0.45.18

To Reproduce

Steps to reproduce the behavior:
Just do a check of a website with an xpath filter

Expected behavior
No errors and showing the difference with the last check

Screenshots

Desktop (please complete the following information):
not applicable

Smartphone (please complete the following information):
not applicable

Additional context
Seems to be reported in https://forum.cloudron.io/topic/11456/module-lxml-etree-has-no-attribute-_elementstringresult too

dgtlmoon · 2024-04-17T13:51:44Z

Please run `pip3 list` and tell me what version of lxml you have. Also, you didn't say whether this install is a Docker container or something else...

searchjaunt · 2024-04-17T14:24:05Z

Thanks for the quick response.
Sorry for not mentioning it, but it does indeed run in a Docker container.
Running `docker exec -it XXX pip3 list` returns
lxml 5.2.1
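
For anyone who would rather check from inside Python than scan the `pip3 list` output, the following is a minimal sketch using only the standard library. The helper name `installed_version` is made up for this example and is not part of changedetection.io.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this interpreter's environment.
        return None


if __name__ == "__main__":
    # Inside the container this would be expected to report the lxml version.
    print("lxml", installed_version("lxml"))
```

It has to run with the same interpreter that runs changedetection.io, for example through `docker exec`, otherwise it reports the host's packages instead.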

dgtlmoon · 2024-04-17T16:46:48Z

Ok I can reproduce it, it is limited to xpath1 queries only

xpath1:/html/head/title

dgtlmoon · 2024-04-17T16:52:29Z

changedetection.io/changedetectionio/html_tools.py

Line 175 in d4dac23

if type(element) == etree._ElementStringResult:

In 5.1.1 lxml removed _ElementStringResult(), this was used to get the ->text() of a result #778 #751
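
A minimal sketch of a check that does not depend on the private class still existing. It is not the fix that was merged in #2313; the helper names are hypothetical, and the `SimpleNamespace` objects are stand-ins for two lxml builds so the example runs without lxml installed. It assumes lxml's XPath string results are subclasses of the built-in string types.

```python
from types import SimpleNamespace


def string_result_types(etree_module):
    """Return the classes that count as an XPath string result.

    Private lxml result classes are looked up with getattr(), so a build
    that no longer exposes them does not raise AttributeError.
    """
    private_names = ("_ElementUnicodeResult", "_ElementStringResult")
    found = []
    for name in private_names:
        cls = getattr(etree_module, name, None)
        if isinstance(cls, type):
            found.append(cls)
    # Plain str/bytes are always accepted as string results.
    return (str, bytes, *found)


def is_string_result(element, etree_module):
    """True when `element` is a string-like XPath result, not a node."""
    return isinstance(element, string_result_types(etree_module))


# Hypothetical stand-ins: an older build with the class, a newer one without.
old_etree = SimpleNamespace(
    _ElementStringResult=type("_ElementStringResult", (bytes,), {})
)
new_etree = SimpleNamespace()

print(is_string_result("some text", new_etree))  # string result
print(is_string_result(object(), new_etree))     # stands in for an element node
```

With real lxml the call would be `is_string_result(result, lxml.etree)`. Using `isinstance` rather than `type(...) ==` also covers subclasses.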

searchjaunt · 2024-04-17T17:38:54Z

Thanks for the investigation. Do you still need any information from my side? What is the next step?

dgtlmoon · 2024-04-17T17:50:46Z

@searchjaunt please paste me the exact selector you are using, visual-selector never generates text() type selectors afaik

searchjaunt · 2024-04-17T18:15:34Z

some random examples:
xpath1://article[@class='page sticky grid gt-large'][1]

xpath1://table[@id='wegenwerkendata'][1]
xpath1://div[1]/div[2]/div[2]
xpath1://div[3]/div[2]/div[1]/div[1]/div[1]
xpath1://div[1]/section[1]/div[1]

I've never used the visual selector

I have > 300 sites failing now.

dgtlmoon · 2024-04-17T18:33:44Z

Could you try 0.45.19 just released?

dgtlmoon · 2024-04-17T20:37:24Z

xpath1://table[@id='wegenwerkendata'][1]
xpath1://div[1]/div[2]/div[2]
xpath1://div[3]/div[2]/div[1]/div[1]/div[1]
xpath1://div[1]/section[1]/div[1]

Note: this will only trigger if those elements are there; the error won't show otherwise.
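
A small sketch of why that is, based on the type check quoted earlier from html_tools.py. The `extract` function is a simplified stand-in and not the project's code, and `etree_without_attr` is a hypothetical substitute for an lxml build that lacks the class. The attribute is only looked up inside the loop, so a filter that matches nothing never reaches it.

```python
from types import SimpleNamespace


def extract(results, etree_module):
    """Simplified per-result type check, modelled on the quoted line."""
    texts = []
    for element in results:
        # The attribute lookup happens only when there is a result to inspect.
        if type(element) == etree_module._ElementStringResult:
            texts.append(str(element))
        else:
            texts.append(repr(element))
    return texts


# Hypothetical stand-in for an lxml build without _ElementStringResult.
etree_without_attr = SimpleNamespace()

# No matching elements: the loop body never runs, so nothing fails.
print(extract([], etree_without_attr))

# At least one match: the missing attribute is finally looked up.
try:
    extract(["<div>"], etree_without_attr)
except AttributeError as exc:
    print("raised:", exc)
```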

searchjaunt · 2024-04-17T20:47:31Z

I tried a couple of them and I'm now getting the error
can only concatenate str (not "bytes") to str
PS not sure if I understand your latest note
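
For readers who hit the same message: Python raises it whenever a `bytes` value is appended to a `str`. With lxml this can happen because `etree.tostring()` returns bytes unless it is called with `encoding="unicode"`. Whether that was the cause in 0.45.19 is not confirmed in this thread, so the following is only a sketch of the general pattern, with made-up helper names.

```python
def as_text(chunk, encoding="utf-8"):
    """Return `chunk` as str, decoding it first when it is bytes."""
    if isinstance(chunk, bytes):
        return chunk.decode(encoding, errors="replace")
    return str(chunk)


def join_fragments(fragments):
    """Concatenate HTML fragments that may be a mix of str and bytes."""
    html = ""
    for fragment in fragments:
        # Normalising first avoids mixing str and bytes in the concatenation.
        html += as_text(fragment)
    return html


# Mixing the two types directly reproduces the reported TypeError.
try:
    "<p>a</p>" + b"<p>b</p>"
except TypeError as exc:
    print("raised:", exc)

print(join_fragments(["<p>a</p>", b"<p>b</p>"]))
```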

dgtlmoon · 2024-04-17T20:56:35Z

if i had your exact selectors when you reported the bug, then i would not have released a new version without testing your selectors :( none-the-less, thanks... i'll keep working at it

dgtlmoon · 2024-04-17T20:57:14Z

@searchjaunt any chance you can grace me with what URL you are watching that causes that? really need the most info possible

searchjaunt · 2024-04-17T21:18:40Z

Sure, here are two of them (the ones I tried that returned the latest error)
https://stratenplan.gistel.be/gipod/wegeniswerken
xpath1://table[@id='wegenwerkendata'][1]

https://www.kortrijk.be/nieuws?f%5B0%5D=%3A&f%5B1%5D=categorie%3Amobiliteit
xpath1://div[3]/div[2]/div[1]/div[1]/div[1]

No other options or filtering, and the fetch method is Basic fast Plaintext/HTTP Client (for the record, it also occurs with WebDriver Chrome/Javascript; Playwright/Chrome is not installed).

dgtlmoon · 2024-04-17T21:30:21Z

does changing it to //table[@id='wegenwerkendata'][1] work?

xconverge · 2024-04-17T21:31:52Z

here are 3 where I see these issues

https://www.amd.com/en/support/chipsets/amd-socket-am4/x570
xpath1:/html/body/div[1]/main/div/div/div/div/div[1]/div[1]/div/div[2]/details[1]/div/div[1]/div/span/div/div[2]

https://www.boss.info/global/support/by_product/katana-50_mk2/updates_drivers/4d633c80-f506-440e-94ce-055aaba48df3/
xpath1:/html/body/form/div[4]/div[1]/article/div[2]/div[2]

https://www.arturia.com/products/audio/minifuse/resources
xpath1:/html/body/div/div[1]/main/section[9]/div/div[4]/div[2]/div[2]/div[1]/table/tbody

xconverge · 2024-04-17T22:09:41Z

removing xpath1: from each has them working again I think

searchjaunt · 2024-04-18T06:04:32Z

@dgtlmoon that works indeed. I hope the conclusion won't be that I need to remove xpath1 for > 300 sites.
I didn't add them myself; they started appearing from a certain version of changedetection (I can't recall which one).
Why did it work before 0.45.18?

dgtlmoon · 2024-04-18T08:59:57Z

@searchjaunt "Why did it work before 0.45.18?" Because, as you said, it's a container, and the container was built differently; that's how containers work.

dgtlmoon · 2024-04-18T09:32:47Z

> @dgtlmoon that works indeed. I hope the conclusion won't be that I need to remove xpath1 for > 300 sites.
> I didn't add them myself; they started appearing from a certain version of changedetection (I can't recall which one).
> Why did it work before 0.45.18?

If you had given me better examples to test with from the very start, this wouldn't have happened. It was only because I was missing exact information. Usually I never start working on a bug until I have the exact data someone is using, but this time I did and it bit me.

dgtlmoon · 2024-04-18T10:00:35Z

I re-tested all situations mentioned above (all URLs and filters) and in the newest 0.45.20 they all pass

please try that version (0.45.20)

searchjaunt · 2024-04-18T11:32:32Z

I just installed 0.45.20 and I still get
'str' object has no attribute 'name'
for
https://www.depinte.be/werken
//div[1]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/div[1]

I explicitly removed xpath1:

other settings

nothing else

Some other things:
I also got some more false positives, like

Apart from the spacing (I don't know where it comes from, since the site wasn't changed) there is no difference.

Despite being up to date, I get the message that there is a new version available

Constantin1489 · 2024-04-18T11:35:27Z

Hi, I made a mistake when I implemented XPath 3.1. When I made the "xpath:" prefix use the elementpath lib (XPath 3.1), I forgot to duplicate the original xpath1 tests for the new "xpath1:" prefix.
I'm currently investigating this xpath1 problem. I'm sorry.
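
To make the prefix scheme concrete, here is a rough sketch of how a filter rule could be routed to an engine. It is an illustration only, not changedetection.io's code: the function name and the engine labels are invented, and treating an unprefixed rule that starts with "/" as the default XPath 3.1 engine is an assumption drawn from comments in this thread.

```python
def split_filter(rule: str):
    """Split a filter rule into (engine label, expression)."""
    rule = rule.strip()
    if rule.startswith("xpath1:"):
        # Legacy behaviour: XPath 1.0 evaluated by lxml.
        return "lxml-xpath1", rule[len("xpath1:"):]
    if rule.startswith("xpath:"):
        # XPath 3.1 evaluated by the elementpath library.
        return "elementpath-xpath3.1", rule[len("xpath:"):]
    if rule.startswith("/"):
        # Assumption: a bare path uses the default (XPath 3.1) engine.
        return "elementpath-xpath3.1", rule
    # Anything else (for example a CSS selector) is handled elsewhere.
    return "other", rule


print(split_filter("xpath1://table[@id='wegenwerkendata'][1]"))
print(split_filter("//table[@id='wegenwerkendata'][1]"))
```

Under this reading, removing the "xpath1:" prefix sends the same expression through a different engine, which would explain why the watches started working again without it.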

EDIT: remove '//' in prefix

Constantin1489 · 2024-04-18T12:18:33Z

@searchjaunt I can't reproduce the 'str' object has no attribute 'name' with v0.45.20

Add other test result.

searchjaunt · 2024-04-18T12:21:26Z

Still getting it though:

dgtlmoon · 2024-04-18T12:26:46Z

Once again you
- failed to provide the exact URL and filters
- failed to prove the version

searchjaunt · 2024-04-18T12:29:39Z

@dgtlmoon see #2312 (comment)
Just tried deleting and creating it again, but with the same result

Constantin1489 · 2024-04-18T12:34:52Z

@searchjaunt Could you run this command?
docker run -it -e LOGGER_LEVEL=CRITICAL --rm YOURCONTAINER_IMAGE bash -c 'pip3 list'

You can get YOURCONTAINER_IMAGE for your running container with sudo docker ps; an image name looks like mikebrady/shairport-sync:latest, for example.

Constantin1489 · 2024-04-18T13:59:08Z

@searchjaunt Hi, I tried to reproduce this with versions 0.45.18, 0.45.19 and 0.45.20. I couldn't reproduce 'str' object has no attribute 'name'.

searchjaunt · 2024-04-18T14:17:39Z

@Constantin1489 did you try the URL
https://www.depinte.be/werken
with the xpath
//div[1]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/div[1]
and the other settings as mentioned in
#2312 (comment)

Constantin1489 · 2024-04-18T14:43:04Z

Yes!

Constantin1489 · 2024-04-18T15:33:12Z

@xconverge Also for the default xpath(XPath3.1).. That is why I didn't kill xpath1 and preserved the previous xpath syntax with 'xpath1:'

XPath 3.1 support is important because when a user tries XPath 2 to XPath 3.1 syntax found on Stack Overflow, it will fail in most cases, because lxml only implements XPath 1. Also, Python's native xml XPath support doesn't cover all of the XPath 1 syntax, and it differs a little from the W3C XPath 1 spec (especially in namespace notation).

I will soon publish a report repo on this subject (within two weeks? I'm cleaning up the code now). Spoiler alert: the number of tests is huge, and they show why XPath 3.1 works without problems in Python (when the configuration is correct).

EDIT: So, basically there are pros and cons in xml or xpath parsers in Python. But the experience provided by elementpath lib is great because you can use xpath in the xpath spec without the problem.
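
As a small illustration of the point about Python's native XML module, the standard library's `xml.etree.ElementTree` accepts simple paths and attribute predicates but rejects an XPath 1 function call such as `contains()`. This sketch uses only the standard library; the sample document and the helper name `stdlib_supports` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Tiny made-up document, just enough to evaluate the paths against.
DOC = ET.fromstring(
    "<html><body><table id='t'><tr><td>road works</td></tr></table></body></html>"
)


def stdlib_supports(path: str) -> bool:
    """True if ElementTree's limited XPath dialect accepts `path`."""
    try:
        DOC.findall(path)
        return True
    except SyntaxError:
        # ElementTree raises SyntaxError for predicates it cannot parse.
        return False


print(stdlib_supports(".//table[@id='t']"))
print(stdlib_supports(".//td[contains(text(),'road')]"))
```

lxml evaluates the second expression as XPath 1, and elementpath adds the XPath 2 and later features on top, which is the trade-off described above.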

searchjaunt · 2024-04-18T15:52:33Z

@Constantin1489 Strange. So what can I do to debug it or make it work? I also find it rather strange that the header says a new version is available while 0.45.20 is installed (see earlier screenshot).

Constantin1489 · 2024-04-18T15:54:39Z

@searchjaunt could you provide the command, script, Dockerfile or docker-compose.yml you use to run changedetectionio? Before posting it here, please test that the command you provide actually works.

Also, does the problem happen in all the watches?

navels · 2024-04-18T15:59:34Z

FYI I am also on 20 and am getting the "new version is available" banner. Installation is via this proxmox script: https://github.com/tteck/Proxmox/blob/main/ct/changedetection.sh

Constantin1489 · 2024-04-18T16:03:14Z

Ah, sorry. I thought you were saying the syntax is not working. As for the new version banner, that will disappear. @navels does your xpath1 syntax work?

dgtlmoon · 2024-04-18T16:26:49Z

Able to reproduce it with this shared watch https://changedetection.io/share/QtZ-94DW41sa on .20; the error is actually now a different error, 'str' object has no attribute '__name__'

When I use an earlier lxml version the error still exists, so @searchjaunt this issue is unrelated; I will open a new one.

dgtlmoon · 2024-04-18T16:31:24Z

Ok, this unrelated issue is now over at #2318 thanks @Constantin1489

dgtlmoon · 2024-04-19T22:13:22Z

tldr - fixed :)

searchjaunt added the triage label Apr 17, 2024

searchjaunt assigned dgtlmoon Apr 17, 2024

dgtlmoon added a commit that referenced this issue Apr 17, 2024

Closes #2312

0777017

dgtlmoon mentioned this issue Apr 17, 2024

module 'lxml.etree' has no attribute '_ElementStringResult' - reimplement _ElementStringResult #2313

Merged

dgtlmoon closed this as completed in #2313 Apr 17, 2024

dgtlmoon added a commit that referenced this issue Apr 17, 2024

Bug fix for newer lxml module - module 'lxml.etree' has no attribute …

7470790

…'_ElementStringResult' - reimplement _ElementStringResult (#2313 #2312)

dgtlmoon reopened this Apr 18, 2024

dgtlmoon added the bug Something isn't working label Apr 18, 2024

dgtlmoon added a commit that referenced this issue Apr 18, 2024

Bug fix - further work on lxml filter extract (#2313 #2312 #2317)

3ae9bfa

dgtlmoon closed this as completed Apr 18, 2024

Constantin1489 added a commit to Constantin1489/changedetection.io that referenced this issue Apr 18, 2024

Revert "Bug fix - further work on lxml filter extract (dgtlmoon#2313 d…

72b2f5f

…gtlmoon#2312 dgtlmoon#2317)" This reverts commit 3ae9bfa.

dgtlmoon removed the triage label May 2, 2024
