We are trying the Crawler, and we noticed that our Next 14 site is not being indexed.
The problem is probably that many of our nested components render text inside <div> instead of <p>.
I realize this isn't ideal in terms of accessibility and semantics, but we have this need.
Looking at the source code (general-purpose.ts, https://github.com/askorama/crawly/blob/2892e473775a408495d07a0dea016ec23a85d362/src/general-purpose.ts#L34-L51), we realized that the contents of the <div>s are ignored entirely.
In fact, @gioboa and I ran a test modifying your function to add <div>s to the query, but noisy, non-useful DOM elements were indexed as well, so that doesn't seem like a viable solution.
Proposed Solution
We thought an interesting idea might be to let users decide what content to index outside of your rules.
A very simple hypothetical solution could be to add a data-orama attribute to the elements you want indexed on your site, and extend the crawler to also query those elements:
<div data-orama> content </div>
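On the crawler side, the opt-in query could be sketched roughly like this (a hypothetical sketch; `DEFAULT_SELECTORS`, `ORAMA_ATTR`, and `buildContentSelector` are illustrative names, not Crawly's actual code):

```typescript
// Illustrative defaults standing in for the crawler's current query.
const DEFAULT_SELECTORS: string[] = ["p", "h1", "h2", "h3", "li", "td"];
const ORAMA_ATTR = "data-orama";

// Build the CSS selector string the crawler would pass to querySelectorAll().
function buildContentSelector(includeOptIn: boolean): string {
  const selectors = [...DEFAULT_SELECTORS];
  if (includeOptIn) {
    // Only elements explicitly marked by the site author are added,
    // avoiding the noise that querying all <div>s would bring in.
    selectors.push(`[${ORAMA_ATTR}]`);
  }
  return selectors.join(", ");
}
```

The key point is that `[data-orama]` is an attribute selector, so it matches only elements the site author has opted in, regardless of tag name.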
I think it might be a simple, clean, and powerful way to extend it.
What do you think?
Alternatives
Another future solution could be to allow the crawler function to be completely customized by users.
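For instance, the crawler could accept a user-supplied extraction hook. This is a minimal sketch under assumed names (`CrawlerOptions`, `extract`, `crawlPage` are hypothetical, not part of Crawly's current API):

```typescript
// Hypothetical types standing in for the crawler's internals.
type Page = { url: string; html: string };
type Doc = { url: string; content: string };

interface CrawlerOptions {
  // Users supply their own extraction logic; the crawler only fetches pages.
  extract: (page: Page) => Doc[];
}

function crawlPage(page: Page, options: CrawlerOptions): Doc[] {
  return options.extract(page);
}

// Example: a user-supplied extractor that naively strips tags,
// so <div> content is indexed without any special-casing.
const docs = crawlPage(
  { url: "https://example.com", html: "<div data-orama>Hello</div>" },
  {
    extract: (p) => [
      { url: p.url, content: p.html.replace(/<[^>]+>/g, "").trim() },
    ],
  }
);
```

This would make the selector question moot, since each site could decide exactly what gets indexed.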
Additional Context
No response
This is a great idea! I made a PR to the repo to add custom selectors, and I will ping you when it's merged and these options are also added on the website.