Browser: Search Page #46

cookiengineer · 2020-10-22T16:35:29Z

The stealth:search Page needs an Online and Offline search integration.

For now, the following search engines seem promising, because their APIs can be used without tokens or user-specific authentication information:

wiby.me can be integrated with a simple JSON request to https://wiby.me/json/?q=key%20words&o=15. The first result page doesn't need an o=... parameter. Results are returned in batches of 15.
searx.me (and all other instances) actually has a very nice, well-documented API [1] that also supports JSON as a response format via https://searx.xyz/search?q=key%20words&format=json. The results are returned in pages, and the pageno parameter accepts 1 or higher numbers. However, if no results are returned, the JSON is basically an empty array. There's seemingly no way to find out whether or not page 1 already includes all found results.
The searx integration might need something like an engines list, passed as a comma-separated parameter in the request URL. The list is pretty huge, but it is also documented [2].
The Web Archive API is currently unclear, because there seems to be only outdated information about it. This probably needs some investigation of the source code that's being used on web.archive.org.

[1] Search API
[2] Search Engines
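
As a rough sketch of the request formats described above (not the actual Stealth implementation), the URLs could be built like this. The helper names `wiby_url`, `searx_url` and `collect_results` are hypothetical, and so is the assumption that the wiby.me `o` parameter is an offset that grows by 15 per page. `fetch_page` stands for any callable that returns the list of results for a given page number.

```python
from urllib.parse import quote, urlencode

WIBY_BATCH_SIZE = 15  # results per batch, as observed above


def wiby_url(keywords, page=0):
    """Build a wiby.me JSON URL; page 0 omits the o=... parameter."""
    params = {"q": keywords}
    if page > 0:
        # Assumption: o is a result offset in steps of 15
        params["o"] = page * WIBY_BATCH_SIZE
    return "https://wiby.me/json/?" + urlencode(params, quote_via=quote)


def searx_url(instance, keywords, pageno=1, engines=None):
    """Build a searx search URL that asks for a JSON response."""
    params = {"q": keywords, "format": "json", "pageno": pageno}
    if engines:
        # engines is sent as one comma-separated parameter
        params["engines"] = ",".join(engines)
    return instance.rstrip("/") + "/search?" + urlencode(params, quote_via=quote)


def collect_results(fetch_page, max_pages=5):
    """Request pages 1..max_pages and stop at the first empty page."""
    results = []
    for pageno in range(1, max_pages + 1):
        page = fetch_page(pageno)
        if not page:
            # An empty array is the only hint that there are no more results
            break
        results.extend(page)
    return results
```

Since there seems to be no total count in the response, stopping at the first empty page (with an upper bound on the number of pages) is probably the simplest way to deal with the pagination.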

cookiengineer · 2020-10-22T16:39:07Z

The list of searx instances (shown on https://searx.space) is also available as a JSON file at https://searx.space/data/instances.json.
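
A minimal sketch of how that file could be turned into a list of instance URLs. This assumes (unverified here) that the JSON has a top-level `instances` object whose keys are the instance URLs. The function name `instance_urls` is hypothetical.

```python
import json


def instance_urls(instances_json):
    """Return the sorted instance URLs from the instances.json text."""
    data = json.loads(instances_json)
    # Assumption: {"instances": {"<instance url>": {...details...}, ...}}
    instances = data.get("instances", {})
    return sorted(url.rstrip("/") for url in instances)
```

If the file looks different in practice, only the lookup of the `instances` key should need to change.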

2075 · 2020-10-22T16:49:46Z

i did not know about searx, awesome! should be integrated on OS level

cookiengineer · 2020-10-25T23:51:55Z

After investigating this for two days, the Search Page has been implemented in a rudimentary manner.

The Web Archive's advancedsearch.php does not allow searching for keywords, only for specific URLs. The normal search.php would theoretically support a keyword search, but it can only return multiple MB of HTML code. So for now, the Web Archive API is useless for the Search Page.

The wiby.me API seems to be rate-limited and doesn't accept requests without a faked User-Agent string, which seems kind of weird. This needs some further investigation in the future, but for now the searx-integrated results are good enough.
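
For reference, a small sketch of a wiby.me request that carries a browser-like User-Agent header. The User-Agent value is only a placeholder and `wiby_request` is a hypothetical helper. Which header values wiby.me actually accepts is not verified.

```python
from urllib.request import Request

# Placeholder value; any browser-like string might work
PLACEHOLDER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0"
)


def wiby_request(url, user_agent=PLACEHOLDER_USER_AGENT):
    """Prepare (but do not send) a request with a faked User-Agent."""
    headers = {"User-Agent": user_agent, "Accept": "application/json"}
    return Request(url, headers=headers)
```

The prepared request can then be passed to `urllib.request.urlopen()`. Because of the rate limit, requests should probably be spaced out as well.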

However, the redirect of https://web.archive.org/*/<complete url> can be easily used to identify whether or not there's a web archived version of the page available. The issue for this is #19 (stealth:fix-request Page).

cookiengineer added this to the X0 - Codename Spirit milestone Oct 22, 2020

cookiengineer closed this as completed Oct 25, 2020

cookiengineer self-assigned this Oct 25, 2020
