# Composite Selectors

Combining selectors is one of the most powerful features of `soupsavvy`. By allowing you to mix and match different selectors, `soupsavvy` offers a highly flexible and customizable way to refine your search criteria. In this demo, we'll explore **Higher Order Selectors** and demonstrate how they can be leveraged to perform complex searches effectively.

### Higher Order Selectors

Higher Order Selectors enable you to combine multiple selectors into a single, composite selector, enhancing your ability to target specific elements. Whether you're matching tags, attributes, or text content, these selectors can be used together to create more advanced search logic.

## Combinators

We'll begin with **Combinators**, which are inspired by the CSS concept of combinators. These allow you to pass multiple selectors and perform a search on a `Tag` object, mimicking the logic of CSS combinators but with the added power and flexibility of `soupsavvy`.

For more information on CSS combinators, you can refer to this [Mozilla guide](https://developer.mozilla.org/en-US/docs/Web/CSS/Child_combinator).

While `BeautifulSoup` provides methods like `select` and `select_one` to find tags using CSS selectors, these are often limited by the constraints of vanilla CSS. `soupsavvy` goes beyond these limitations, offering more complex conditions such as regular expressions and other advanced selection logic that can be seamlessly combined using these powerful selectors.

### DescendantCombinator

The **Descendant Combinator** is one of the simplest and most frequently used combinators in CSS. It selects elements that match a second selector only if they have an ancestor that matches the first selector. In CSS, this relationship is represented by a single space `" "` between two selectors. For example, the following CSS:

```css
.book .price
```

matches all tags with the class `price` that are descendants of tags with the class `book`. For more details on the descendant combinator, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).

#### Using DescendantCombinator in `soupsavvy`

In `soupsavvy`, this logic is encapsulated in the `DescendantCombinator` class, which functions similarly to its CSS counterpart. The `DescendantCombinator` accepts two or more selectors and returns `Tag` objects that satisfy the descendant relationship between them.

Here's an example equivalent to the CSS above:

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, DescendantCombinator

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div class="book">
            <span class="title">Animal Farm</span>
            <span class="price_section">
                <p class="price">Price: $20</p>
            </span>
        </div>
    """,
    features="lxml",
)

book_selector = AttributeSelector("class", value="book")
price_selector = AttributeSelector("class", value="price")
selector = DescendantCombinator(book_selector, price_selector)
selector.find(soup)

As mentioned previously, `soupsavvy` provides an alternative way of creating some composite selectors by using operators. A more concise way to create a `DescendantCombinator` is the `>>` operator, which acts as syntactic sugar:

```python
DescendantCombinator(left, right) == left >> right
```

Here, the left selector matches the ancestor and the right selector matches the descendant.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div class="book">
            <span class="title">Animal Farm</span>
            <span class="price_section">
                <p class="price">Price: $20</p>
            </span>
        </div>
    """,
    features="lxml",
)

book_selector = AttributeSelector("class", value="book")
price_selector = AttributeSelector("class", value="price")
selector = book_selector >> price_selector
selector.find(soup)

The order of selectors in a `DescendantCombinator` is significant. `left >> right` is not the same as `right >> left`.

In [None]:
from soupsavvy import AttributeSelector, DescendantCombinator

book_selector = AttributeSelector("class", value="book")
price_selector = AttributeSelector("class", value="price")

print(
    "left >> right == DescendantCombinator(left, right):",
    book_selector >> price_selector
    == DescendantCombinator(book_selector, price_selector),
)
print(
    "left >> right == right >> left:",
    price_selector >> book_selector == book_selector >> price_selector,
)

#### Handling Multiple Selectors

`DescendantCombinator` allows you to chain together any number of selectors as positional arguments. When more than two selectors are provided, they are chained in the order they appear, creating more complex selection logic. This is similar to chaining selectors in CSS.

For instance, the following CSS:

```css
#available .book .price
```

matches all elements with the class `price` that are descendants of an element with the class `book`, which in turn is a descendant of the element with the ID `available`.

In `soupsavvy`, you can achieve this with `DescendantCombinator` as shown below:

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, IdSelector, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div class="book">
            <span class="title">Animal Farm</span>
            <p class="price">Price: $10</p>
        </div>
        <div id="available">
            <div class="book">
                <span class="title">Animal Farm</span>
                <span class="price_section">
                    <p class="price">Price: $20</p>
                </span>
            </div>
        </div>
    """,
    features="lxml",
)

available_selector = IdSelector("available")
book_selector = ClassSelector("book")
price_selector = ClassSelector("price")
selector = available_selector >> book_selector >> price_selector
selector.find(soup)

#### Non-recursive search

Setting the `recursive` parameter to `False` returns an element only if the element matched by the first selector is a direct child of the searched element. This logic applies to all `soupsavvy` combinators.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, TypeSelector, DescendantCombinator

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <span class="not_child_book">
            <div class="book">
                <span class="title">Animal Farm</span>
                <span class="price_section">
                    <p class="price">Price: $50</p>
                </span>
            </div>
        </span>
        <div class="book">
            <span class="title">Animal Farm</span>
            <span class="price_section">
                <p class="price">Price: $20</p>
            </span>
        </div>
    """,
    features="html.parser",
)

book_selector = AttributeSelector("class", value="book")
price_selector = AttributeSelector("class", value="price")
selector = book_selector >> price_selector
selector.find(soup, recursive=False)

In this case, `Price: $50` is not selected, because its ancestor `<div class="book">`, which matched the first selector, is not a direct child of the searched element.

### ChildCombinator

`ChildCombinator` mirrors the behavior of the CSS child combinator. In CSS, a child combinator (`>` symbol) selects elements that are direct children of a specified parent element. This is a stricter relationship than the descendant combinator, as it only matches elements that are immediate children of a given element, not just any descendant.

For example, in CSS:

```css
div > .price
```

This selector matches all elements with the class `price` that are direct children of a `<div>` element. For more details, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/Child_combinator).

In `soupsavvy`, this logic is implemented with the `ChildCombinator` class. It accepts two or more selectors as positional arguments and returns elements that match the specified child-parent relationships.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, ChildCombinator, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div>
            <span class="title">Animal Farm</span>
            <span class="discount">
                <h2>Discounted</h2>
                <p class="price">Price: $15</p>
            </span>
            <p class="price">Price: $20</p>
        </div>
    """,
    features="lxml",
)

book_selector = TypeSelector("div")
price_selector = AttributeSelector("class", value="price")
selector = ChildCombinator(book_selector, price_selector)
selector.find(soup)

A more concise way to create a `ChildCombinator` is the `>` operator, which is consistent with CSS syntax:

```python
ChildCombinator(left, right) == left > right
```

Here, the left selector matches the parent and the right selector matches the child.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div>
            <span class="title">Animal Farm</span>
            <span class="discount">
                <h2>Discounted</h2>
                <p class="price">Price: $15</p>
            </span>
            <p class="price">Price: $20</p>
        </div>
    """,
    features="lxml",
)

book_selector = TypeSelector("div")
price_selector = ClassSelector("price")
selector = book_selector > price_selector
selector.find(soup)
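
The difference between the child and descendant relationships can also be seen outside `soupsavvy`. The sketch below is only an analogy: it uses the limited XPath support in Python's standard `xml.etree.ElementTree` on the same markup, wrapped in a hypothetical `<root>` element so that it parses as XML. It does not show how `soupsavvy` works internally.

```python
import xml.etree.ElementTree as ET

# Same markup as above, wrapped in a root element so it is well-formed XML.
root = ET.fromstring(
    """
    <root>
        <p class="price">Price: $30</p>
        <div>
            <span class="title">Animal Farm</span>
            <span class="discount">
                <h2>Discounted</h2>
                <p class="price">Price: $15</p>
            </span>
            <p class="price">Price: $20</p>
        </div>
    </root>
    """
)

# "/" steps to direct children only, like the CSS child combinator `div > .price`.
children = [el.text for el in root.findall(".//div/*[@class='price']")]

# "//" steps to any depth, like the CSS descendant combinator `div .price`.
descendants = [el.text for el in root.findall(".//div//*[@class='price']")]

print(children)     # ['Price: $20']
print(descendants)  # ['Price: $15', 'Price: $20']
```

The top-level `Price: $30` is excluded in both cases because it is not inside a `<div>`, and `Price: $15` only appears in the descendant result because it is nested one level deeper.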

### Combining Combinators

Just like any other `soupsavvy` selector, combinators can be used as inputs to other higher-order selectors. For instance, you can define a combination of parent-child and descendant relationships within a single selector. In CSS, this would be represented as:

```css
#available > div .price
```

In `soupsavvy`, this logic can be replicated by using `ChildCombinator` and `DescendantCombinator` together. This combination specifies that the tag with the id `available` must have a direct child `<div>`, which in turn must contain a descendant element with the class `price`.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, IdSelector, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $30</p>
        <div>
            <span class="title">Animal Farm</span>
            <p class="price">Price: $10</p>
        </div>
        <div id="available">
            <div>
                <span class="title">Animal Farm</span>
                <span class="discount">
                    <h2>Discounted</h2>
                    <p class="price">Price: $15</p>
                </span>
                <p class="price">Price: $20</p>
            </div>
        </div>
    """,
    features="lxml",
)

available_selector = IdSelector("available")
book_selector = TypeSelector("div")
price_selector = ClassSelector("price")
selector = (available_selector > book_selector) >> price_selector
selector.find(soup)

When combining selectors with operators in `soupsavvy`, it's important to understand that some operators have higher precedence than others, which can affect the order in which expressions are evaluated.

For example, the `right shift` operator (`>>`) has higher precedence than the `greater than` operator (`>`). This means that in an expression like:

```python
left > middle >> right
```

The `>>` operation is performed first, resulting in:

```python
ChildCombinator(left, DescendantCombinator(middle, right))
```

In contrast, if you add parentheses to explicitly define the order of operations:

```python
(left > middle) >> right
```

This expression would be interpreted as:

```python
DescendantCombinator(ChildCombinator(left, middle), right)
```

In the provided example, the grouping isn't significant, because combining `ChildCombinator` and `DescendantCombinator` in this way is associative: the order in which the operators are applied doesn't affect the final result. However, this may not always be the case, so to avoid ambiguity and ensure expected behavior, it's always safer to include parentheses when combining operators.

In [None]:
from soupsavvy import (
    ClassSelector,
    DescendantCombinator,
    IdSelector,
    TypeSelector,
    ChildCombinator,
)

available_selector = IdSelector("available")
book_selector = TypeSelector("div")
price_selector = ClassSelector("price")

print(
    DescendantCombinator(
        ChildCombinator(available_selector, book_selector), price_selector
    )
    == (available_selector > book_selector) >> price_selector
)
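
The precedence rule described above comes from Python itself, so it can be checked without `soupsavvy`. The sketch below uses a hypothetical toy class, `Sel`, that only records how the operators group its operands. It is not `soupsavvy`'s implementation, and its labels are illustrative.

```python
class Sel:
    """Toy selector that records how operators combine it (not soupsavvy's class)."""

    def __init__(self, label):
        self.label = label

    def __gt__(self, other):
        # Stand-in for building a child relationship.
        return Sel(f"Child({self.label}, {other.label})")

    def __rshift__(self, other):
        # Stand-in for building a descendant relationship.
        return Sel(f"Descendant({self.label}, {other.label})")


left, middle, right = Sel("left"), Sel("middle"), Sel("right")

# `>>` binds tighter than `>`, so the right-hand pair is grouped first.
implicit = (left > middle >> right).label
print(implicit)  # Child(left, Descendant(middle, right))

# Parentheses force the child relationship to be built first.
explicit = ((left > middle) >> right).label
print(explicit)  # Descendant(Child(left, middle), right)

# Python treats `a > b > c` as a chained comparison: `(a > b) and (b > c)`.
# With this toy class, the first result is truthy, so only `middle > right` survives.
chained = (left > middle > right).label
print(chained)  # Child(middle, right)
```

The last case is a general Python rule for comparison operators, shown here only on the toy class. It is another reason to prefer explicit parentheses or the class constructor when chaining `>`.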

### NextSiblingCombinator

`NextSiblingCombinator` replicates the behavior of the CSS adjacent sibling combinator. In CSS, the next (aka adjacent) sibling combinator (denoted by the `+` symbol) selects an element that directly follows a specified sibling element.

For example, in CSS:

```css
div + .price
```

This selector matches all elements with the class `price` that are the immediate next siblings of `<div>` elements. For more details, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/Next-sibling_combinator).

In `soupsavvy`, the `NextSiblingCombinator` class implements this logic. It takes two or more selectors as positional arguments and returns elements that match the specified preceding-adjacent sibling relationships.

The example below takes a different approach and uses `PatternSelector` to match text content: an element with the class `price` is selected if it is the immediate next sibling of an element with the text content `Discounted`.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, NextSiblingCombinator, PatternSelector

soup = BeautifulSoup(
    """
        <h2>Discounted</h2>
        <span>Unavailable</span>
        <p class="price">Price: $10</p>
        <h1>Discounted</h1>
        <p class="price">Price: $15</p>
    """,
    features="lxml",
)

discount_selector = PatternSelector("Discounted")
price_selector = ClassSelector("price")
selector = NextSiblingCombinator(discount_selector, price_selector)
selector.find(soup)

A more concise way to create a `NextSiblingCombinator` is the `+` operator, which is consistent with CSS syntax:

```python
NextSiblingCombinator(left, right) == left + right
```

Here, the left selector matches the preceding sibling and the right selector matches the next sibling.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, PatternSelector

soup = BeautifulSoup(
    """
        <h2>Discounted</h2>
        <span>Unavailable</span>
        <p class="price">Price: $10</p>
        <h1>Discounted</h1>
        <p class="price">Price: $15</p>
    """,
    features="lxml",
)

discount_selector = PatternSelector("Discounted")
price_selector = ClassSelector("price")
selector = discount_selector + price_selector
selector.find(soup)

When multiple selectors are provided, they are chained in the order they appear: the first selector matches an element, and each subsequent selector must match the next sibling of the previously matched element.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, PatternSelector

soup = BeautifulSoup(
    """
        <h2>Discounted</h2>
        <span>Breaking relationship :(</span>
        <p class="price">Price: $15</p>
        <span class="title">Animal Farm 1</span>
        
        <h2>Discounted</h2>
        <p class="price">Price: $15</p>
        <span>Breaking relationship :(</span>
        <span class="title">Animal Farm 2</span>
        
        <h2>Discounted</h2>
        <p class="price">Price: $15</p>
        <span class="title">Animal Farm 3</span>
    """,
    features="lxml",
)

discount_selector = PatternSelector("Discounted")
price_selector = ClassSelector("price")
title_selector = ClassSelector("title")
selector = discount_selector + price_selector + title_selector
selector.find(soup)

When a Higher Order Selector is built from many selectors, they can be *reduced* into one by chaining them with an operator. `functools.reduce` is a convenient way to do this.

In [None]:
from functools import reduce
from operator import add

from bs4 import BeautifulSoup

from soupsavvy import ClassSelector

soup = BeautifulSoup(
    """
       <span class="c1"></span>
       <span class="c2"></span>
       <span class="c3"></span>
       <span class="c4"></span>
       <span class="c5"></span>
    """,
    features="lxml",
)

selector = reduce(add, (ClassSelector(f"c{i}") for i in range(1, 6)))
selector.find(soup)
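
To make the folding order visible, here is a small sketch with a hypothetical toy class whose `+` operator only records the grouping. It is not `soupsavvy`'s class; it just shows that `reduce(add, ...)` applies the operator from left to right, exactly as writing the chain by hand would.

```python
from functools import reduce
from operator import add


class Sel:
    """Toy selector whose `+` records the grouping (not soupsavvy's class)."""

    def __init__(self, label):
        self.label = label

    def __add__(self, other):
        return Sel(f"({self.label} + {other.label})")


selectors = [Sel(f"c{i}") for i in range(1, 6)]

# reduce is a left fold: it combines the running result with the next item.
folded = reduce(add, selectors)
print(folded.label)  # ((((c1 + c2) + c3) + c4) + c5)

# Writing the chain by hand groups the operands the same way.
c1, c2, c3, c4, c5 = selectors
by_hand = c1 + c2 + c3 + c4 + c5
print(by_hand.label == folded.label)  # True
```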

### SubsequentSiblingCombinator

`SubsequentSiblingCombinator` emulates the behavior of the CSS subsequent sibling combinator. In CSS, the subsequent sibling combinator (denoted by the `~` symbol) selects elements that follow the element matched by the first selector and share the same parent.

For example, in CSS:

```css
div ~ .price
```

This selector matches all elements with the class `price` that are siblings of a `<div>` and appear after it in the document. For more details, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/Subsequent-sibling_combinator).

In `soupsavvy`, the `SubsequentSiblingCombinator` class provides similar functionality. It accepts two or more selectors as positional arguments and returns elements that match the specified preceding-subsequent sibling relationships. For more information on search logic when multiple selectors are provided, refer to the `NextSiblingCombinator` section.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, SubsequentSiblingCombinator, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $25</p>
        <h2>Discounted</h2>
        <span>Bargain!!!</span>
        <p class="price">Price: $15</p>
        <p class="price">Price: $10</p>
    """,
    features="lxml",
)

discount_selector = TypeSelector("h2")
price_selector = ClassSelector("price")
selector = SubsequentSiblingCombinator(discount_selector, price_selector)
selector.find_all(soup)

A more concise way to create a `SubsequentSiblingCombinator` is the `*` operator:

```python
SubsequentSiblingCombinator(left, right) == left * right
```

Here, the left selector matches the preceding sibling and the right selector matches the subsequent sibling.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, TypeSelector

soup = BeautifulSoup(
    """
        <p class="price">Price: $25</p>
        <h2>Discounted</h2>
        <span>Bargain!!!</span>
        <p class="price">Price: $15</p>
        <p class="price">Price: $10</p>
    """,
    features="lxml",
)

discount_selector = TypeSelector("h2")
price_selector = ClassSelector("price")
selector = discount_selector * price_selector
selector.find(soup)
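
The two sibling relationships can be contrasted with a plain-Python model. The sketch below represents the siblings from the example as hypothetical `(tag, class, text)` tuples in document order. The helper functions are illustrative only and are not part of `soupsavvy`.

```python
# Siblings from the example above, in document order: (tag, class, text).
siblings = [
    ("p", "price", "Price: $25"),
    ("h2", None, "Discounted"),
    ("span", None, "Bargain!!!"),
    ("p", "price", "Price: $15"),
    ("p", "price", "Price: $10"),
]


def is_h2(el):
    return el[0] == "h2"


def is_price(el):
    return el[1] == "price"


def next_sibling_matches(items, first, second):
    """Model of a next sibling relationship: only the element right after a match counts."""
    return [nxt[2] for cur, nxt in zip(items, items[1:]) if first(cur) and second(nxt)]


def subsequent_sibling_matches(items, first, second):
    """Model of a subsequent sibling relationship: any later sibling of a match counts."""
    start = next((i for i, el in enumerate(items) if first(el)), None)
    if start is None:
        return []
    return [el[2] for el in items[start + 1:] if second(el)]


adjacent = next_sibling_matches(siblings, is_h2, is_price)
following = subsequent_sibling_matches(siblings, is_h2, is_price)

print(adjacent)   # []
print(following)  # ['Price: $15', 'Price: $10']
```

The adjacent result is empty because a `<span>` sits directly after the `<h2>`. `Price: $25` is excluded from both results because it precedes the `<h2>`.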

## Logical Selectors

Another category of Higher Order Selectors in `soupsavvy` is **Logical Selectors**. These selectors enable you to create new selectors by combining multiple selectors using logical operators like `AND`, `OR`, `NOT` and `XOR`.

### AndSelector

`AndSelector` is the `soupsavvy` equivalent of the CSS compound selector. In CSS, a compound selector is a sequence of simple selectors that are not separated by a combinator. An element matches a compound selector if it satisfies all the simple selectors in the sequence. For example, the following CSS compound selector:

```css
p.price
```

matches all `<p>` elements with the class `price`.

In `soupsavvy`, the `AndSelector` class achieves similar functionality. It accepts two or more selectors as positional arguments and returns elements that match **all** the specified selectors. This allows for highly specific and precise element selection. For more information about this selector refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_selectors/Selector_structure#compound_selector).

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import (
    AndSelector,
    ClassSelector,
    TypeSelector,
)

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <span class="price">Price: $30</span>
        <p class="price">Price: $20</p>
    """,
    features="lxml",
)

p_selector = TypeSelector("p")
price_selector = ClassSelector("price")
selector = AndSelector(p_selector, price_selector)
selector.find(soup)

A more concise way to create an `AndSelector` is the `&` operator:

```python
AndSelector(left, right) == left & right
```

Here, both the left and the right selector must match for an element to be selected.

`AndSelector` accepts two or more selectors as positional arguments and returns elements that match **all** of them. The `&` operator can be chained to the same effect:

In [None]:
import re

from bs4 import BeautifulSoup

from soupsavvy import (
    AttributeSelector,
    ClassSelector,
    PatternSelector,
    TypeSelector,
)

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <p class="price">Price: $30</p>
        <p class="price">Price: $20</p>
        <p class="price" href="/shop">Price: €12</p>
        <span class="price" href="/shop">Price: $18</span>
        <p class="price" href="/shop">Price: $15</p>
    """,
    features="lxml",
)

selector = (
    TypeSelector("p")
    & ClassSelector("price")
    & AttributeSelector("href", value="/shop")
    & PatternSelector(re.compile(r"\$\d+"))
)
selector.find(soup)
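
Conceptually, a chain of `&` behaves like `all()` over a list of conditions. The sketch below models the elements from the example as plain dictionaries (a hypothetical simplification that keeps only the fields the selectors inspect) and applies the same four conditions as predicates. It illustrates the logic only, not how `soupsavvy` evaluates selectors.

```python
import re

# Elements from the example above, reduced to the fields the selectors inspect.
# "\u20ac" is the euro sign.
elements = [
    {"tag": "p", "class": "title", "href": None, "text": "Animal Farm"},
    {"tag": "p", "class": "price", "href": None, "text": "Price: $30"},
    {"tag": "p", "class": "price", "href": None, "text": "Price: $20"},
    {"tag": "p", "class": "price", "href": "/shop", "text": "Price: \u20ac12"},
    {"tag": "span", "class": "price", "href": "/shop", "text": "Price: $18"},
    {"tag": "p", "class": "price", "href": "/shop", "text": "Price: $15"},
]

# One predicate per selector in the `&` chain.
conditions = [
    lambda el: el["tag"] == "p",
    lambda el: el["class"] == "price",
    lambda el: el["href"] == "/shop",
    lambda el: re.search(r"\$\d+", el["text"]) is not None,
]

# An element is kept only if every condition holds.
matches = [el["text"] for el in elements if all(cond(el) for cond in conditions)]
print(matches)  # ['Price: $15']
```

Every other element fails at least one condition: a missing `href`, the wrong tag, or a price that is not in dollars.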

### SelectorList / OrSelector

`SelectorList` is `soupsavvy`'s counterpart of the CSS selector list. In CSS, a selector list is a comma-separated collection of selectors, and an element matches the list if it satisfies any of the individual selectors. For example, consider the following CSS selector list:

```css
h1, h2
```

This selector list matches all `<h1>` and `<h2>` elements.

In `soupsavvy`, `SelectorList` (also known as `OrSelector`, which is an alias) offers similar functionality. It accepts two or more selectors as positional arguments and returns elements that match **any** of the specified selectors.

For more information about selector lists, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/Selector_list).

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, OrSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <span class="discount">
            <p class="price">Price: $25</p>
        </span>
        <p class="title">Brave New World</p>
        <p class="price">Price: $15</p>
    """,
    features="lxml",
)

title_selector = ClassSelector("title")
price_selector = ClassSelector("price")
discount_price_selector = title_selector + ClassSelector("discount") > price_selector
standard_price_selector = title_selector + price_selector

selector = OrSelector(discount_price_selector, standard_price_selector)
selector.find_all(soup)

A more concise way to create a `SelectorList` is the `|` operator:

```python
SelectorList(left, right) == left | right
```
Here, an element is selected if it matches either the left or the right selector.

`SelectorList` accepts two or more selectors as positional arguments and returns elements that match **any** of them.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <span class="discount">
            <p class="price">Price: $25</p>
        </span>
        <p class="title">Brave New World</p>
        <p class="price">Price: $15</p>
    """,
    features="lxml",
)

title_selector = ClassSelector("title")
price_selector = ClassSelector("price")
discount_price_selector = title_selector + ClassSelector("discount") > price_selector
standard_price_selector = title_selector + price_selector

selector = discount_price_selector | standard_price_selector
selector.find_all(soup)

The `SelectorList` in `soupsavvy` is unique in that it functions both as a combinator (due to its CSS-like use case) and as a logical selector (because of its general logic). Unlike other combinators, the order of selectors within a `SelectorList` is not significant; changing the order will not affect the results.

In [None]:
from soupsavvy import ClassSelector, SelectorList

discount_selector = ClassSelector("discount")
price_selector = ClassSelector("price")

print(
    SelectorList(discount_selector, price_selector)
    == SelectorList(price_selector, discount_selector)
)

Additionally, two `SelectorList` instances can be considered equal even if they contain a different number of selectors, as long as they match the same elements.

In [None]:
from soupsavvy import ClassSelector, SelectorList, AttributeSelector

discount_selector = ClassSelector("discount")
price_selector = ClassSelector("price")
another_price_selector = AttributeSelector("class", value="price")

print(
    SelectorList(discount_selector, price_selector)
    == SelectorList(discount_selector, price_selector, another_price_selector)
)

`SelectorList` can also be created using `soupsavvy` API shortcut functions. These functions—`is_`, `or_`, and `where_`—provide alternative ways to instantiate a `SelectorList`. All these functions are equivalent and can be used interchangeably, depending on the context or the developer's preference.

For example, in CSS:

```css
:is(h1, h2)
:where(h1, h2)
```

Both `:is` and `:where` match all `<h1>` and `<h2>` elements. In `soupsavvy`, the `is_` and `where_` functions mimic these pseudo-classes. While the two pseudo-classes differ in specificity in CSS, the functions are interchangeable here. The `or_` function serves as a more intuitive alias, especially when using `SelectorList` in a logical selector context.

For more information about these CSS pseudo-classes, refer to [:is()](https://developer.mozilla.org/en-US/docs/Web/CSS/:is) and [:where()](https://developer.mozilla.org/en-US/docs/Web/CSS/:where).

### NotSelector

`NotSelector` in `soupsavvy` corresponds to the CSS `:not()` pseudo-class. In CSS, `:not()` is used to exclude elements that match a certain selector. It allows you to apply styles to elements that do not match the specified criteria. For example:

```css
:not(.discount)
```

This selector matches all elements except those that have the class `discount`.

In `soupsavvy`, `NotSelector` provides similar functionality. It accepts one or more selectors as arguments and returns elements that do **not** match **any** of the specified selectors.

For further details, refer to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/CSS/:not).

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, NotSelector

soup = BeautifulSoup(
    """
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <p class="price">Price: €15</p>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
selector = NotSelector(discount_selector)
selector.find(soup)

A more concise way to create a `NotSelector` is the `~` operator:

```python
NotSelector(selector) == ~selector
```

In this case, the element must not match the given selector to be selected.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector

soup = BeautifulSoup(
    """
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <p class="price">Price: €15</p>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
selector = ~discount_selector
selector.find(soup)

`NotSelector` accepts one or more selectors as positional arguments and returns elements that do **not** match **any** of them. Since the `~` operator takes only one operand, you can negate a `SelectorList` to exclude multiple selectors:

```python
NotSelector(left, right) == ~(left | right)
```

This is equivalent to the logical expression:

```python
if not any([left, right]):
    ...
```

In [None]:
import re

from bs4 import BeautifulSoup

from soupsavvy import (
    ClassSelector,
    NotSelector,
    PatternSelector,
)

soup = BeautifulSoup(
    """
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <p class="price">Price: €15</p>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
dollars_selector = PatternSelector(re.compile(r"\$\d+"))
selector = NotSelector(discount_selector, dollars_selector)
selector.find(soup)

In [None]:
import re

from bs4 import BeautifulSoup

from soupsavvy import (
    ClassSelector,
    PatternSelector,
)

soup = BeautifulSoup(
    """
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <p class="price">Price: €15</p>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
dollars_selector = PatternSelector(re.compile(r"\$\d+"))
selector = ~(discount_selector | dollars_selector)
selector.find(soup)
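
The equivalence between the two forms above is De Morgan's law: "not (A or B)" is the same as "(not A) and (not B)". The sketch below checks this on a plain-Python model of the three elements, using hypothetical predicate functions in place of selectors. It does not call `soupsavvy`.

```python
import re

# Elements from the example above; "\u20ac" is the euro sign.
elements = [
    {"classes": {"price", "discount"}, "text": "Price: \u20ac10"},
    {"classes": {"price"}, "text": "Price: $20"},
    {"classes": {"price"}, "text": "Price: \u20ac15"},
]


def is_discount(el):
    return "discount" in el["classes"]


def in_dollars(el):
    return re.search(r"\$\d+", el["text"]) is not None


# Model of NotSelector(left, right): keep elements that match none of the predicates.
none_of = [
    i for i, el in enumerate(elements)
    if not any(pred(el) for pred in (is_discount, in_dollars))
]

# De Morgan's law: the same elements satisfy (not left) and (not right).
both_negated = [
    i for i, el in enumerate(elements)
    if not is_discount(el) and not in_dollars(el)
]

print(none_of)                  # [2]
print(none_of == both_negated)  # True
```

Only the element at index 2, the non-discounted price in euros, survives: the first element is discounted and the second is priced in dollars.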

### XORSelector

`XORSelector` is `soupsavvy`'s counterpart of the logical XOR operation, where an element is selected only if it matches exactly one of the provided selectors, but not more than one.

In CSS, there's no direct equivalent to `XORSelector`, but this can be achieved through selector list with compound selectors and `:not()` pseudo-class. For example this CSS:

```css 
span:not(.discount), .discount:not(span)
```

Matches all `<span>` elements that do not have the class `discount` and all elements with the class `discount` that are not `<span>` elements.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, TypeSelector, XORSelector

soup = BeautifulSoup(
    """
        <span class="discount">Buy!</span>
        <p class="price">Price: $10</p>
        <span class="price">Price: $20</span>
        <p class="discount">Price: $30</p>
    """,
    features="lxml",
)

discount_selector = ClassSelector("discount")
span_selector = TypeSelector("span")
selector = XORSelector(discount_selector, span_selector)
selector.find_all(soup)

The same logic could be achieved by combining `AndSelector`, `SelectorList` and `NotSelector` in `soupsavvy`.

In [None]:
from soupsavvy import ClassSelector, TypeSelector

discount_selector = ClassSelector("discount")
span_selector = TypeSelector("span")

xor = (discount_selector & ~span_selector) | (~discount_selector & span_selector)

However, this approach does not scale well to more than two selectors. `XORSelector` provides a more concise and readable way to express this logic. It accepts two or more selectors as positional arguments and returns elements that match **exactly one** of the provided selectors.

In [None]:
import re

from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, PatternSelector, TypeSelector, XORSelector

soup = BeautifulSoup(
    """
        <span class="discount">Buy!</span>
        <p class="price">Price: $10</p>
        <span class="price">Price: $20</span>
        <p class="discount">Price: $30</p>
        <p class="discount">Only Today: $8</p>
        <span>Only Today: $7</span>
        <p>Only Today: $6</p>
    """,
    features="lxml",
)

discount_selector = ClassSelector("discount")
span_selector = TypeSelector("span")
only_today_selector = PatternSelector(re.compile("Only Today:"))
selector = XORSelector(discount_selector, span_selector, only_today_selector)
selector.find_all(soup)
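
The "exactly one" rule can be modeled by counting how many conditions an element satisfies. The sketch below represents the seven elements from the example as hypothetical `(tag, class, text)` tuples and uses plain predicates instead of selectors, so it illustrates the described logic rather than `soupsavvy` internals.

```python
import re

# Elements from the example above: (tag, class, text).
elements = [
    ("span", "discount", "Buy!"),
    ("p", "price", "Price: $10"),
    ("span", "price", "Price: $20"),
    ("p", "discount", "Price: $30"),
    ("p", "discount", "Only Today: $8"),
    ("span", None, "Only Today: $7"),
    ("p", None, "Only Today: $6"),
]

# One predicate per selector passed to the XOR.
predicates = [
    lambda el: el[1] == "discount",
    lambda el: el[0] == "span",
    lambda el: re.search("Only Today:", el[2]) is not None,
]


def exactly_one(el):
    """True when the element satisfies one predicate and no more."""
    return sum(pred(el) for pred in predicates) == 1


matches = [el[2] for el in elements if exactly_one(el)]
print(matches)  # ['Price: $20', 'Price: $30', 'Only Today: $6']

# With three or more inputs, "exactly one" differs from chaining a binary XOR,
# which is true whenever an odd number of inputs is true.
all_true = (True, True, True)
parity = all_true[0] ^ all_true[1] ^ all_true[2]
print(parity)              # True
print(sum(all_true) == 1)  # False
```

Elements that satisfy two conditions (for example the discounted `<span>`) or none (the plain `Price: $10`) are left out.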

## Relative Selectors

### HasSelector

```css
div:has(p)
```

In CSS, this selector matches all `<div>` elements that contain at least one `<p>` descendant.

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, HasSelector

soup = BeautifulSoup(
    """
    <div class="book">
        <span class="title">Brave New World</span>
        <p class="price">Price: $20</p>
    </div>
    <div class="book">
        <span class="title">Animal Farm</span>
        <span class="discount">
            <p class="price">Price: $15</p>
        </span>
        <p class="price">Price: $20</p>
    </div>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
selector = HasSelector(discount_selector)
selector.find(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, HasSelector, PatternSelector

soup = BeautifulSoup(
    """
    <div class="book">
        <span class="title">Brave New World</span>
        <p class="price">Price: $20</p>
    </div>
    <div class="book">
        <span class="title">Animal Farm</span>
        <span class="discount">
            <p class="price">Price: $15</p>
        </span>
        <p class="price">Price: $20</p>
    </div>
    <div class="book">
        <span>Bestseller</span>
        <span class="title">Frankenstein</span>
        <p class="price">Price: $30</p>
    </div>
    """,
    features="html.parser",
)

discount_selector = ClassSelector("discount")
bestseller_selector = PatternSelector("Bestseller")
selector = HasSelector(discount_selector, bestseller_selector)
selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy.selectors.relative import RelativeChild
from soupsavvy import AttributeSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $30</p>
    """,
    features="html.parser",
)

selector = AttributeSelector("class", value="price")
relative = RelativeChild(selector)
relative.find(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, Anchor

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $30</p>
    """,
    features="html.parser",
)

selector = AttributeSelector("class", value="price")
relative = Anchor > selector
print(type(relative))
relative.find(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector
from soupsavvy.selectors.relative import RelativeChild

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $30</p>
        <div>
            <p class="price">Price: $40</p>
        </div>
    """,
    features="html.parser",
)

selector = AttributeSelector("class", value="price")
relative = RelativeChild(selector)

print(relative.find_all(soup, recursive=False))
print(relative.find_all(soup, recursive=True))

In [None]:
from bs4 import BeautifulSoup

from soupsavvy.selectors.relative import RelativeNextSibling
from soupsavvy import AttributeSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <div class="section">Book 1</div>
        <p class="price">Price: $30</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $10</p>
    """,
    features="lxml",
)

selector = AttributeSelector("class", value="price")
relative = RelativeNextSibling(selector)
relative.find_all(soup.div)  # type: ignore

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, Anchor

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $30</p>
        <div>
            <p class="price">Price: $40</p>
            <a href="/shop">Buy Now</a>
        </div>
    """,
    features="html.parser",
)

price_selector = AttributeSelector("class", value="price")
shop_selector = AttributeSelector("href", value="/shop")
or_selector = (Anchor > price_selector) | shop_selector
or_selector.find_all(soup, recursive=True)

## RelativeSelectors with HasSelector

In [None]:
from bs4 import BeautifulSoup

from soupsavvy.selectors.relative import RelativeNextSibling
from soupsavvy import AttributeSelector, HasSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <div class="section">Book 1</div>
        <p class="price">Price: $30</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $10</p>
    """,
    features="lxml",
)

selector = AttributeSelector("class", value="price")
relative = RelativeNextSibling(selector)
has_selector = HasSelector(relative)
has_selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, HasSelector, Anchor, TypeSelector

soup = BeautifulSoup(
    """
        <p class="title">Animal Farm</p>
        <div class="section">Book 1</div>
        <p class="price">Price: $30</p>
        <p class="discount">Price: $20</p>
        <p class="price">Price: $10</p>
    """,
    features="lxml",
)

relative = Anchor + AttributeSelector("class", value="price")
has_selector = HasSelector(relative) & TypeSelector("div")
has_selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import AttributeSelector, HasSelector, Anchor, PatternSelector

soup = BeautifulSoup(
    """
        <div class="newspaper">
            <p class="title">New York Times</p>
            <h1>Only $5!</h1>
        </div>
        <p>Read more</p>
        <div class="book">
            <span>Animal Farm</span>
            <p class="price">Price: $30</p>
        </div>
    """,
    features="lxml",
)

relative_newspaper = Anchor + PatternSelector("Read more")
relative_book = Anchor > AttributeSelector("class", value="price")
has_selector = HasSelector(relative_newspaper, relative_book)
has_selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import HasSelector, Anchor, PatternSelector

soup = BeautifulSoup(
    """
        <div class="countdown">4</div>
        <span>3</span>
        <p id="fgai23">2</p>
        <a href="https://example.com">1</a>
        <span>Stop</span>
        <p>Go</p>
    """,
    features="lxml",
)

relative_countdown = Anchor * PatternSelector("Stop")
has_selector = HasSelector(relative_countdown)
has_selector.find_all(soup)

## NthOfSelector

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector
from soupsavvy.selectors.nth import NthOfSelector

soup = BeautifulSoup(
    """
        <span class="title">Animal Farm</span>
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <span>Bestseller</span>
        <p class="price">Price: €15</p>
        <p class="price">Price: €25</p>
        <p class="price">Price: €17</p>
    """,
    features="lxml",
)

price_selector = ClassSelector("price")
selector = NthOfSelector(price_selector, nth="2n")
selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector
from soupsavvy.selectors.nth import NthLastOfSelector

soup = BeautifulSoup(
    """
        <span class="title">Animal Farm</span>
        <p class="price discount">Price: €10</p>
        <p class="price">Price: $20</p>
        <span>Bestseller</span>
        <p class="price">Price: €15</p>
        <p class="price">Price: €25</p>
        <p class="price">Price: €17</p>
    """,
    features="lxml",
)

price_selector = ClassSelector("price")
selector = NthLastOfSelector(price_selector, nth="odd")
selector.find_all(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, HasSelector
from soupsavvy.selectors.nth import OnlyOfSelector

soup = BeautifulSoup(
    """
    <div class="book">
        <span class="title">Animal Farm</span>
        <p class="price">Price: $15</p>
        <p class="price">Price: $20</p>
    </div>
    <div class="book">
        <span class="title">Frankenstein</span>
        <p class="price">Price: $30</p>
    </div>
    """,
    features="html.parser",
)
price_selector = ClassSelector("price")
selector = HasSelector(OnlyOfSelector(price_selector))
selector.find(soup)

In [None]:
from bs4 import BeautifulSoup

from soupsavvy import ClassSelector, HasSelector
from soupsavvy.selectors.nth import OnlyOfSelector

soup = BeautifulSoup(
    """
    <div class="book">
        <span class="title">Frankenstein</span>
        <p class="price">Price: $30</p>
    </div>
    <div class="book">
        <span class="title">Animal Farm</span>
        <p class="price">Price: $15</p>
        <p class="price">Price: $20</p>
    </div>
    <div class="book">
        <span class="title">Brave New World</span>
        <p class="price">Price: $25</p>
        <p class="price">Price: $20</p>
    </div>
    """,
    features="html.parser",
)
price_selector = ClassSelector("price")
selector = ~HasSelector(OnlyOfSelector(price_selector)) > ClassSelector("title")
selector.find_all(soup, recursive=False)

## Overview