How to select all nodes with class and specific embedded style attribute #57

Closed
underwater opened this issue May 14, 2017 · 2 comments
Labels
question Further information is requested

Comments

@underwater

I'm just starting out with this library and am trying to extract all the elements in an HTML document that have a specific class name and also have a style attribute containing a specific inline value.

My elements are inside a hierarchy that looks like this:
body > table > tbody > tr > td

The p elements I am interested in look like this:

<p class="s11" style="padding-top: 1pt;padding-left: 1pt;text-indent: 0pt;line-height: 8pt;text-align: left;"> MY TEXT </p>

Any guidance would be appreciated.

@atifaziz atifaziz added the question Further information is requested label May 15, 2017
@atifaziz
Owner

You can get to the paragraph elements using p.s11[style]. This selects all p elements that have the class s11 and a style attribute. If you want to refine the selection further based on the actual content of the style attribute, you can do that in LINQ. Here's an example that keeps only the elements whose style attribute has a CSS property named text-align with a value of left:

var q =
    from e in doc.DocumentNode.QuerySelectorAll("p.s11[style]")
    where e.GetAttributeValue("style", string.Empty)
           .Split(';')                                  // one entry per declaration
           .Select(p => p.Split(new[] { ':' }, 2))      // split into name and value at the first ':'
           .Where(p => p.Length == 2)                   // skip empty or malformed declarations
           .Select(p => new { Name = p[0].Trim(), Value = p[1].Trim() })
           .Any(p => p.Name == "text-align" && p.Value == "left") // case-sensitive match
    select e;

Another option is to use regular expressions:

var q = 
    from e in doc.DocumentNode.QuerySelectorAll("p.s11[style]")
    where Regex.IsMatch(e.GetAttributeValue("style", string.Empty), @"\btext-align\s*:\s*left\b")
    select e;
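
If the same style check is needed in more than one query, the splitting logic from the first example could be factored into a small helper. The following is only a sketch under assumptions: StyleParser and ParseStyle are hypothetical names, not part of Fizzler or HtmlAgilityPack, and the code relies only on the .NET base class library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Fizzler or HtmlAgilityPack.
static class StyleParser
{
    // Splits an inline style value such as
    // "text-align: left;line-height: 8pt" into property/value pairs.
    // This is a naive split: it does not handle a ';' that appears
    // inside a quoted string or a url(...) value.
    public static Dictionary<string, string> ParseStyle(string style)
    {
        // Property names are compared case-insensitively; values are kept as written.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (var declaration in (style ?? string.Empty).Split(';'))
        {
            // Split at the first ':' only, so values may themselves contain ':'.
            var parts = declaration.Split(new[] { ':' }, 2);
            if (parts.Length != 2)
                continue; // skip empty or malformed declarations

            var name = parts[0].Trim();
            if (name.Length > 0)
                result[name] = parts[1].Trim(); // a later duplicate overrides an earlier one
        }

        return result;
    }
}
```

With such a helper, the where clause could be reduced to a lookup on StyleParser.ParseStyle(e.GetAttributeValue("style", string.Empty)), for example by calling TryGetValue("text-align", out var value) and comparing value to "left".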

@underwater
Author

underwater commented May 15, 2017 via email
