Did I make a mistake? #137
Comments
|
I'm having a hard time understanding. How are you determining it is wrong? Are you sure you are not counting text nodes? I see you creating a list of children, but I don't see you filtering out just |
Yeah, in your output, I'm seeing |
But, can I get |
You want text nodes and element nodes? |
Yeah. |
Sorry for my naive question. Now I get the difference between a node and an element: https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType |
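For anyone else reading, here is a minimal sketch of that node/element distinction as Beautiful Soup exposes it. The sample HTML and variable names are made up for illustration; the point is only that `.contents` holds both `Tag` objects (element nodes) and `NavigableString` objects (text nodes).

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Made-up sample: a div with two element children and one bare text child.
HTML = "<div><p>Text 1</p>Text 2<span>Text 3</span></div>"

soup = BeautifulSoup(HTML, "html.parser")
div = soup.div

# .contents lists every child node, elements and text alike.
kinds = [type(child).__name__ for child in div.contents]
print(kinds)

# Split the children by node kind.
elements = [child for child in div.contents if isinstance(child, Tag)]
texts = [child for child in div.contents if isinstance(child, NavigableString)]
print(len(elements), texts)
```

Counting `div.contents` therefore gives a different number than counting element children alone, which is the kind of mismatch discussed above.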
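CSS selectors return element nodes, not text nodes. But you can still get what you want by selecting an element and working from its position in the parent. Here I'm going to target a span:

```python
from bs4 import BeautifulSoup

HTML = """
<div>
<p id="first">Text 1</p>
<span>Text 2</span>
<span>Text 3</span>
Text 4
<span>Text 5</span>
</div>
"""

soups = BeautifulSoup(HTML, 'html.parser')
# Select the span that is the third element child of its parent.
element = soups.select_one('span:nth-child(3)')
# Find the span's position among all of the parent's child nodes (text included).
index = element.parent.index(element)
# Everything after the span, text nodes and elements alike.
print(element.parent.contents[index + 1:])
```

Output

```
['\nText 4\n', <span>Text 5</span>, '\n']
```

Makes sense? |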
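As a variation on the same idea, here is a sketch that uses Beautiful Soup's `next_siblings` instead of indexing into the parent, and then keeps only the non-empty text nodes. The names `following` and `text_after` are just illustrative, and whether whitespace-only strings should be dropped is an assumption about what you want.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

HTML = """
<div>
<p id="first">Text 1</p>
<span>Text 2</span>
<span>Text 3</span>
Text 4
<span>Text 5</span>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
element = soup.select_one("span:nth-child(3)")

# All nodes (text and elements) that follow the selected span in its parent.
following = list(element.next_siblings)

# Keep only text nodes, dropping the ones that are pure whitespace.
text_after = [
    node.strip()
    for node in following
    if isinstance(node, NavigableString) and node.strip()
]
print(text_after)
```

The selector still only finds the element; the text nodes come from walking the tree around it.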
But here is my problem: I want to read ebooks with Python, which means I need to get and filter all the text nodes of a website. I also want to use a config file to set the rules for different websites. So your idea might not be good enough, and for many other websites I just need to use one |
Unfortunately, selectors only return elements. You can leverage this with additional logic to get desired text nodes, but selectors won't return text nodes directly. I don't know enough about your project to make suggestions on approach, but it appears selectors are working as expected. |
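To make "additional logic" a bit more concrete, here is a rough sketch of one way it could look, not a recommendation for this particular project. It assumes a hypothetical per-site rule table (`RULES`, with made-up `container` and `skip` selectors) and a hypothetical helper `extract_text`; selectors pick the elements, and `stripped_strings` then walks the text nodes inside them.

```python
from bs4 import BeautifulSoup

# Hypothetical per-site rules; the site name and selectors are invented.
RULES = {
    "example-site": {"container": "div.chapter", "skip": "span.ad"},
}

HTML = """
<div class="chapter">
<p>Text 1</p>
Text 2
<span class="ad">Buy now</span>
<span>Text 3</span>
</div>
"""


def extract_text(html, rule):
    """Return the non-empty text nodes inside the rule's container elements."""
    soup = BeautifulSoup(html, "html.parser")
    # Selectors return elements, so unwanted elements are removed first.
    for unwanted in soup.select(rule["skip"]):
        unwanted.decompose()
    lines = []
    for container in soup.select(rule["container"]):
        # stripped_strings yields each descendant text node, whitespace-trimmed.
        lines.extend(container.stripped_strings)
    return lines


lines = extract_text(HTML, RULES["example-site"])
print(lines)
```

Note that if one matched container is nested inside another, its text would be collected twice, so the container selector needs to be chosen with that in mind.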
Got it. Thanks. But would you consider supporting this in the future? (I did check the bs4 and CSS documentation, though: in CSS, selectors are patterns used to select the element(s) you want to style.) |
Unfortunately, selecting text nodes doesn't make sense with the selectors, so I don't have plans to implement it. |