Description
Issue: When the Jina Reader API (r.jina.ai) processes the URL https://www.foerderdatenbank.de/FDB/Content/DE/Foerderprogramm/Land/Schleswig-Holstein/soziale-wohnraumfoerderung-eigentumsmassnahmen.html, the content of the element matched by the CSS selector body > main > div.jumbotron > div:nth-child(2) > div is missing from the output. Only the lower part of the page's content is returned by the API.
Steps to Reproduce:
- Execute the following curl command (Bash):

  curl "https://r.jina.ai/https://www.foerderdatenbank.de/FDB/Content/DE/Foerderprogramm/Land/Schleswig-Holstein/soziale-wohnraumfoerderung-eigentumsmassnahmen.html" \
    -H "Authorization: Bearer YOUR_JINA_API_TOKEN" \
    -H "X-Wait-For-Selector: body, .class, #id"

  (Note: Replace YOUR_JINA_API_TOKEN with your actual API token and ensure the X-Wait-For-Selector value is exactly as shown, although this header's value might not be directly related to the issue of missing content in a specific included part of the page.)
Expected Behavior:
The Jina Reader API should process and return the content from the entire main body of the specified webpage, including the section identified by the selector body > main > div.jumbotron > div:nth-child(2) > div.
Actual Behavior:
The content from the element with the CSS selector body > main > div.jumbotron > div:nth-child(2) > div is omitted from the API's output. Only the lower portion of the website's content is returned.
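To make the omission easier to confirm, here is a minimal sketch that fetches the page through the Reader and checks whether a known piece of text from the missing section appears in the output. The function names (reader_url, contains_snippet, check_page) and the JINA_API_TOKEN environment variable are hypothetical helpers for illustration. Only the r.jina.ai URL prefix and the Authorization header come from the report above. The snippet to search for is not given in the report, so you have to copy it yourself from the affected section in a browser.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and JINA_API_TOKEN are assumptions, not part of the Reader API.

# Build the Reader URL by prefixing the target page with the r.jina.ai endpoint.
reader_url() {
  printf 'https://r.jina.ai/%s' "$1"
}

# Exit 0 if the text on stdin contains the given literal snippet, non-zero otherwise.
contains_snippet() {
  grep -qF -- "$1"
}

# Fetch a page through the Reader and report whether the snippet is present.
# Returns 0 if present, 1 if missing, 2 if the request itself failed.
check_page() {
  local page="$1" snippet="$2" output
  output="$(curl -sS --fail "$(reader_url "$page")" \
    -H "Authorization: Bearer ${JINA_API_TOKEN}")" || return 2
  if printf '%s\n' "$output" | contains_snippet "$snippet"; then
    echo "present"
  else
    echo "missing"
    return 1
  fi
}
```

After exporting JINA_API_TOKEN, one would call check_page with the URL from this report and a sentence copied from the element at body > main > div.jumbotron > div:nth-child(2) > div. The request is treated separately from the text check so that a network or authentication failure is not mistaken for missing content.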