feat(selectors): has-text pseudo-class (#5120)
This pseudo-class matches approximately when
`element.textContent.includes(textToSearchFor)`.
dgozman committed Jan 25, 2021
1 parent 77b5f05 commit 894abbf
Showing 4 changed files with 166 additions and 21 deletions.
76 changes: 60 additions & 16 deletions docs/src/selectors.md
@@ -134,44 +134,88 @@ selectors in a more compact form.

```js
// Clicks a <button> that has either a "Log in" or "Sign in" text.
await page.click('button:is(:text("Log in"), :text("Sign in"))');
await page.click(':is(button:has-text("Log in"), button:has-text("Sign in"))');
```

```python async
# Clicks a <button> that has either a "Log in" or "Sign in" text.
await page.click('button:is(:text("Log in"), :text("Sign in"))')
await page.click(':is(button:has-text("Log in"), button:has-text("Sign in"))')
```

```python sync
# Clicks a <button> that has either a "Log in" or "Sign in" text.
page.click('button:is(:text("Log in"), :text("Sign in"))')
page.click(':is(button:has-text("Log in"), button:has-text("Sign in"))')
```

## Selecting elements by text

The `:text` pseudo-class matches elements that have a text node child with specific text.
It is similar to the [text] engine, but can be used in combination with other `css` selector extensions.
There are a few variations that support different arguments:
The `:has-text` pseudo-class matches elements that have specific text somewhere inside, possibly in a child or a descendant element. It is approximately equivalent to `element.textContent.includes(textToSearchFor)`.

* `:text("substring")` - Matches when element's text contains "substring" somewhere. Matching is case-insensitive. Matching also normalizes whitespace, for example it turns multiple spaces into one, turns line breaks into spaces and ignores leading and trailing whitespace.
* `:text-is("string")` - Matches when element's text equals the "string". Matching is case-insensitive and normalizes whitespace.
* `button:text("Sign in")` - Text selector may be combined with regular CSS.
* `:text-matches("[+-]?\\d+")` - Matches text against a regular expression. Note that special characters like back-slash `\`, quotes `"`, square brackets `[]` and more should be escaped. Learn more about [regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp).
* `:text-matches("value", "i")` - Matches text against a regular expression with specified flags.
The `:text` pseudo-class matches elements that have a text node child with specific text. It is similar to the [text] engine.

Click a button with text "Sign in":
`:has-text` and `:text` should be used differently. Consider the following page:
```html
<div class=nav-item>Home</div>
<div class=nav-item>
<span class=bold>New</span> products
</div>
<div class=nav-item>
<span class=bold>All</span> products
</div>
<div class=nav-item>Contact us</div>
```

Use `:has-text()` to click a navigation item that contains text "All products".
```js
await page.click('button:text("Sign in")');
await page.click('.nav-item:has-text("All products")');
```
```python async
await page.click('.nav-item:has-text("All products")')
```
```python sync
page.click('.nav-item:has-text("All products")')
```
`:has-text()` will match even though "All products" text is split between multiple elements. However, it will also match any parent element of this navigation item, including `<body>` and `<html>`, because each of them contains "All products" somewhere inside. Therefore, `:has-text()` must be used together with other `css` specifiers, like a tag name or a class name.
```js
// Wrong, will match many elements including <body>
await page.click(':has-text("All products")');
// Correct, only matches the navigation item
await page.click('.nav-item:has-text("All products")');
```

```python async
await page.click('button:text("Sign in")')
# Wrong, will match many elements including <body>
await page.click(':has-text("All products")')
# Correct, only matches the navigation item
await page.click('.nav-item:has-text("All products")')
```
```python sync
# Wrong, will match many elements including <body>
page.click(':has-text("All products")')
# Correct, only matches the navigation item
page.click('.nav-item:has-text("All products")')
```
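The ancestor-matching behaviour can be reproduced with a small model of the page above. This is only a sketch: `SketchElement`, `sketchTextContent`, `sketchHasText` and `sketchAllElements` are hypothetical names invented for illustration, and the matching rule is the approximate `textContent.includes(...)` description from this document, not Playwright's actual engine.

```ts
// Simplified stand-in for a DOM tree; these types are hypothetical, not Playwright's.
type SketchNode = string | SketchElement;
interface SketchElement {
  tag: string;
  className?: string;
  children: SketchNode[];
}

// Mirrors Element.textContent: concatenates the text of all descendants.
function sketchTextContent(node: SketchNode): string {
  if (typeof node === 'string')
    return node;
  return node.children.map(sketchTextContent).join('');
}

// Approximation of :has-text(): normalize whitespace, lowercase, then substring test.
function sketchHasText(element: SketchElement, text: string): boolean {
  const normalize = (s: string) => s.trim().replace(/\s+/g, ' ').toLowerCase();
  return normalize(sketchTextContent(element)).indexOf(normalize(text)) !== -1;
}

// Collects the root and every descendant element in document order.
function sketchAllElements(root: SketchElement): SketchElement[] {
  const result: SketchElement[] = [root];
  for (const child of root.children) {
    if (typeof child !== 'string')
      result.push(...sketchAllElements(child));
  }
  return result;
}

const sketchPage: SketchElement = {
  tag: 'html',
  children: [{
    tag: 'body',
    children: [
      { tag: 'div', className: 'nav-item', children: ['Home'] },
      { tag: 'div', className: 'nav-item', children: ['\n', { tag: 'span', className: 'bold', children: ['New'] }, ' products\n'] },
      { tag: 'div', className: 'nav-item', children: ['\n', { tag: 'span', className: 'bold', children: ['All'] }, ' products\n'] },
      { tag: 'div', className: 'nav-item', children: ['Contact us'] },
    ],
  }],
};

// Bare :has-text("All products") matches <html>, <body> and the navigation item.
const bareMatches = sketchAllElements(sketchPage).filter(e => sketchHasText(e, 'All products'));
// Adding the .nav-item class restriction leaves only the navigation item.
const scopedMatches = bareMatches.filter(e => e.className === 'nav-item');
console.log(bareMatches.map(e => e.tag), scopedMatches.length);
```

In this model the bare selector yields three elements, and `<html>` comes first in document order, which is why a click on the unscoped selector would not target the navigation item.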

Use `:text()` to click an element that directly contains text "Home".
```js
await page.click(':text("Home")');
```
```python async
await page.click(':text("Home")')
```
```python sync
page.click('button:text("Sign in")')
page.click(':text("Home")')
```
`:text()` only matches the element that contains the text directly inside, but not any parent elements. It is suitable for use without other `css` specifiers. However, it does not match text across elements. For example, `:text("All products")` will not match anything, because "All" and "products" belong to different elements.
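As a rough illustration of why `:text("All products")` finds nothing, the sketch below models `:text()` as "one of the element's own text-node children contains the text". The names `DirectElement` and `sketchDirectText` are hypothetical, and the rule is a simplification based on the description in this document; the real engine may treat adjacent text nodes and input values differently.

```ts
// Hypothetical minimal tree: a child is either a text node (string) or an element.
type DirectNode = string | DirectElement;
interface DirectElement {
  tag: string;
  children: DirectNode[];
}

const normalizeDirectText = (s: string) => s.trim().replace(/\s+/g, ' ').toLowerCase();

// Simplified :text(): only the element's own text-node children are inspected,
// so text that lives in a child element never counts towards the parent.
function sketchDirectText(element: DirectElement, text: string): boolean {
  const wanted = normalizeDirectText(text);
  for (const child of element.children) {
    if (typeof child === 'string' && normalizeDirectText(child).indexOf(wanted) !== -1)
      return true;
  }
  return false;
}

const homeItem: DirectElement = { tag: 'div', children: ['Home'] };
const allSpan: DirectElement = { tag: 'span', children: ['All'] };
const allItem: DirectElement = { tag: 'div', children: ['\n', allSpan, ' products\n'] };

console.log(sketchDirectText(homeItem, 'Home'));         // text sits directly in the <div>
console.log(sketchDirectText(allItem, 'All products'));  // "All" is inside the <span>, not the <div>
console.log(sketchDirectText(allSpan, 'All products'));  // the <span> only holds "All"
```

Neither the `<div>` nor the `<span>` holds the whole phrase in a single text node, which is the case where `:has-text()` with a class or tag restriction is the better fit.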

:::note
Both `:has-text()` and `:text()` perform case-insensitive matching. They also normalize whitespace: for example, they turn multiple spaces into one, turn line breaks into spaces, and ignore leading and trailing whitespace.
:::
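The normalization described in the note can be sketched as a small matcher factory. The first steps (trim, collapse whitespace, lowercase) follow the `textMatcher` lines visible in this commit; the rest of that function is not shown here, so the returned closure and the name `sketchTextMatcher` are assumptions made for illustration.

```ts
// Hypothetical matcher factory: `substring` selects "contains" versus "equals" semantics.
function sketchTextMatcher(text: string, substring: boolean): (s: string) => boolean {
  // Trim, collapse whitespace runs (including line breaks) into one space, lowercase.
  const normalize = (s: string) => s.trim().replace(/\s+/g, ' ').toLowerCase();
  const wanted = normalize(text);
  return (s: string) => {
    const actual = normalize(s);
    return substring ? actual.indexOf(wanted) !== -1 : actual === wanted;
  };
}

const containsHelloWorld = sketchTextMatcher('hello   world', true);
const isExactlyYa = sketchTextMatcher('ya', false);

console.log(containsHelloWorld('\nHELLO \n world ')); // line breaks and case are ignored
console.log(isExactlyYa(' YA '));                      // surrounding whitespace is ignored
console.log(isExactlyYa('yay'));                       // exact mode rejects longer text
```

Normalizing both the query and the candidate text the same way is what makes the comparison symmetric, so a query written with extra spaces behaves like the tidy version.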

There are a few `:text()` variations that support different arguments:
* `:text("substring")` - Matches when a text node inside the element contains "substring". Matching is case-insensitive and normalizes whitespace.
* `:text-is("string")` - Matches when all text nodes inside the element combined have the text value equal to "string". Matching is case-insensitive and normalizes whitespace.
* `:text-matches("[+-]?\\d+")` - Matches text nodes against a regular expression. Note that special characters like back-slash `\`, quotes `"`, square brackets `[]` and more should be escaped. Learn more about [regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp).
* `:text-matches("value", "i")` - Matches text nodes against a regular expression with specified flags.
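The regular-expression variants can be approximated with a plain `RegExp` test. This is a sketch only: `sketchTextMatches` is a hypothetical helper, and it does not attempt to reproduce whatever whitespace handling or text-node selection the real engine applies before testing.

```ts
// Hypothetical helper: builds the expression from its source and optional flags,
// then tests it against a piece of text. The match is unanchored unless the
// pattern itself uses ^ or $.
function sketchTextMatches(text: string, source: string, flags?: string): boolean {
  return new RegExp(source, flags).test(text);
}

console.log(sketchTextMatches('yo', 'y', 'g'));                 // unanchored match
console.log(sketchTextMatches('yo', 'Y', 'i'));                 // "i" flag ignores case
console.log(sketchTextMatches('yo', '^y$'));                    // anchors require the whole text to be "y"
console.log(sketchTextMatches('Total: -42', '[+-]?\\d+'));      // "\\d" in a string literal is the regex \d
```

The doubled backslash in the last call is ordinary string-literal escaping, which is the same reason the selector in the list above is written as `[+-]?\\d+`.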

## Selecting elements in Shadow DOM

2 changes: 1 addition & 1 deletion src/server/common/selectorParser.ts
@@ -26,7 +26,7 @@ export type ParsedSelector = {
capture?: number,
};

export const customCSSNames = new Set(['not', 'is', 'where', 'has', 'scope', 'light', 'visible', 'text', 'text-matches', 'text-is', 'above', 'below', 'right-of', 'left-of', 'near', 'nth-match']);
export const customCSSNames = new Set(['not', 'is', 'where', 'has', 'scope', 'light', 'visible', 'text', 'text-matches', 'text-is', 'has-text', 'above', 'below', 'right-of', 'left-of', 'near', 'nth-match']);

export function parseSelector(selector: string): ParsedSelector {
const result = parseSelectorV1(selector);
22 changes: 19 additions & 3 deletions src/server/injected/selectorEvaluator.ts
@@ -58,6 +58,7 @@ export class SelectorEvaluatorImpl implements SelectorEvaluator {
this._engines.set('text', textEngine);
this._engines.set('text-is', textIsEngine);
this._engines.set('text-matches', textMatchesEngine);
this._engines.set('has-text', hasTextEngine);
this._engines.set('right-of', createPositionEngine('right-of', boxRightOf));
this._engines.set('left-of', createPositionEngine('left-of', boxLeftOf));
this._engines.set('above', createPositionEngine('above', boxAbove));
@@ -408,15 +409,15 @@ const visibleEngine: SelectorEngine = {

const textEngine: SelectorEngine = {
matches(element: Element, args: (string | number | Selector)[], context: QueryContext, evaluator: SelectorEvaluator): boolean {
if (args.length === 0 || typeof args[0] !== 'string')
if (args.length !== 1 || typeof args[0] !== 'string')
throw new Error(`"text" engine expects a single string`);
return elementMatchesText(element, context, textMatcher(args[0], true));
},
};

const textIsEngine: SelectorEngine = {
matches(element: Element, args: (string | number | Selector)[], context: QueryContext, evaluator: SelectorEvaluator): boolean {
if (args.length === 0 || typeof args[0] !== 'string')
if (args.length !== 1 || typeof args[0] !== 'string')
throw new Error(`"text-is" engine expects a single string`);
return elementMatchesText(element, context, textMatcher(args[0], false));
},
@@ -431,6 +432,17 @@ const textMatchesEngine: SelectorEngine = {
},
};

const hasTextEngine: SelectorEngine = {
matches(element: Element, args: (string | number | Selector)[], context: QueryContext, evaluator: SelectorEvaluator): boolean {
if (args.length !== 1 || typeof args[0] !== 'string')
throw new Error(`"has-text" engine expects a single string`);
if (shouldSkipForTextMatching(element))
return false;
const matcher = textMatcher(args[0], true);
return matcher(element.textContent || '');
},
};

function textMatcher(text: string, substring: boolean): (s: string) => boolean {
text = text.trim().replace(/\s+/g, ' ');
text = text.toLowerCase();
@@ -441,8 +453,12 @@ function textMatcher(text: string, substring: boolean): (s: string) => boolean {
};
}

function shouldSkipForTextMatching(element: Element) {
return element.nodeName === 'SCRIPT' || element.nodeName === 'STYLE' || document.head && document.head.contains(element);
}

function elementMatchesText(element: Element, context: QueryContext, matcher: (s: string) => boolean) {
if (element.nodeName === 'SCRIPT' || element.nodeName === 'STYLE' || document.head && document.head.contains(element))
if (shouldSkipForTextMatching(element))
return false;
if ((element instanceof HTMLInputElement) && (element.type === 'submit' || element.type === 'button') && matcher(element.value))
return true;
87 changes: 86 additions & 1 deletion test/selectors-text.spec.ts
@@ -106,7 +106,7 @@ it('should work', async ({page}) => {
expect((await page.$$(`text="lo wo"`)).length).toBe(0);
});

it('should work in v2', async ({page}) => {
it('should work with :text', async ({page}) => {
await page.setContent(`<div>yo</div><div>ya</div><div>\nHELLO \n world </div>`);
expect(await page.$eval(`:text("ya")`, e => e.outerHTML)).toBe('<div>ya</div>');
expect(await page.$eval(`:text-is("ya")`, e => e.outerHTML)).toBe('<div>ya</div>');
@@ -120,6 +120,91 @@ it('should work in v2', async ({page}) => {
expect(await page.$eval(`:text-matches("y", "g")`, e => e.outerHTML)).toBe('<div>yo</div>');
expect(await page.$eval(`:text-matches("Y", "i")`, e => e.outerHTML)).toBe('<div>yo</div>');
expect(await page.$(`:text-matches("^y$")`)).toBe(null);

const error1 = await page.$(`:text("foo", "bar")`).catch(e => e);
expect(error1.message).toContain(`"text" engine expects a single string`);
const error2 = await page.$(`:text(foo > bar)`).catch(e => e);
expect(error2.message).toContain(`"text" engine expects a single string`);
});

it('should work with :has-text', async ({page}) => {
await page.setContent(`
<input id=input2>
<div id=div1>
<span> Find me </span>
or
<wrap><span id=span2>maybe me </span></wrap>
<div><input id=input1></div>
</div>
`);
expect(await page.$eval(`:has-text("find me")`, e => e.tagName)).toBe('HTML');
expect(await page.$eval(`span:has-text("find me")`, e => e.outerHTML)).toBe('<span> Find me </span>');
expect(await page.$eval(`div:has-text("find me")`, e => e.id)).toBe('div1');
expect(await page.$eval(`div:has-text("find me") input`, e => e.id)).toBe('input1');
expect(await page.$eval(`:has-text("find me") input`, e => e.id)).toBe('input2');
expect(await page.$eval(`div:has-text("find me or maybe me")`, e => e.id)).toBe('div1');
expect(await page.$(`div:has-text("find noone")`)).toBe(null);
expect(await page.$$eval(`:is(div,span):has-text("maybe")`, els => els.map(e => e.id).join(';'))).toBe('div1;span2');
expect(await page.$eval(`div:has-text("find me") :has-text("maybe me")`, e => e.tagName)).toBe('WRAP');
expect(await page.$eval(`div:has-text("find me") span:has-text("maybe me")`, e => e.id)).toBe('span2');

const error1 = await page.$(`:has-text("foo", "bar")`).catch(e => e);
expect(error1.message).toContain(`"has-text" engine expects a single string`);
const error2 = await page.$(`:has-text(foo > bar)`).catch(e => e);
expect(error2.message).toContain(`"has-text" engine expects a single string`);
});

it(':text and :has-text should work with large DOM', async ({page}) => {
await page.evaluate(() => {
let id = 0;
const next = (tag: string) => {
const e = document.createElement(tag);
const eid = ++id;
e.textContent = 'id' + eid;
e.id = 'id' + eid;
return e;
};
const generate = (depth: number) => {
const div = next('div');
const span1 = next('span');
const span2 = next('span');
div.appendChild(span1);
div.appendChild(span2);
if (depth > 0) {
div.appendChild(generate(depth - 1));
div.appendChild(generate(depth - 1));
}
return div;
};
document.body.appendChild(generate(12));
});
const selectors = [
':has-text("id18")',
':has-text("id12345")',
':has-text("id")',
':text("id18")',
':text("id12345")',
':text("id")',
'#id18',
'#id12345',
'*',
];

const measure = false;
for (const selector of selectors) {
const time1 = Date.now();
for (let i = 0; i < (measure ? 10 : 1); i++)
await page.$$eval(selector, els => els.length);
if (measure)
console.log(`pw("${selector}"): ` + (Date.now() - time1));

if (measure && !selector.includes('text')) {
const time2 = Date.now();
for (let i = 0; i < (measure ? 10 : 1); i++)
await page.evaluate(selector => document.querySelectorAll(selector).length, selector);
console.log(`qs("${selector}"): ` + (Date.now() - time2));
}
}
});

it('should be case sensitive if quotes are specified', async ({page}) => {