list: Search Engines [for Wiki] #118

Closed
Thorin-Oakenpants opened this Issue May 15, 2017 · 33 comments

Member

Thorin-Oakenpants commented May 15, 2017

Please suggest any search engines to add to the wiki list. For each search engine we decide to include, we will try to find a decent AMO version, and we will also provide sanitized XMLs later on.

As suggestions are accepted, I'll edit this list in the first post.


Intro

Someone write an intro

Recommended

webpages for perusal, we could expand some info on each engine

Other [Sanitized]

  • Google.com nc (no country) [Do we provide a sanitized Google? We would make it clear it's not a privacy-respecting engine at all and list it separately]

Anyway, get cracking and I'll type it up

Atavic May 15, 2017

Collaborator

I had a look at some engines with this search string: ghacks-user.js
Both Startpage and ixquick connect to the same routit.net address and the results are almost the same.

Neither https://www.google.com/ncr nor http://www.google.com/ncr work for me anymore.

Thorin-Oakenpants May 15, 2017

Member

Neither https://www.google.com/ncr nor http://www.google.com/ncr work for me anymore.

Me neither. http is dead bro, it will just redirect to https

Thorin-Oakenpants May 15, 2017

Member

^^ Indeed .. sanitized means no one f**ks with it.

RoxKilly May 15, 2017

Collaborator

If you're going down the path of supporting narrow engines, we might end up down a rabbit hole with no end in sight, and we risk adding a lot of overhead to the project because small engines are more likely to disappear from the web, breaking users' experience.

I use DuckDuckGo Lite, which does not have any JavaScript.

Collaborator

crssi commented May 15, 2017

RoxKilly May 15, 2017

Collaborator

@crssi didn't know about that one. I'm having a hard time understanding what it does.

  • Clear searches from browser history? how? If so, I'm not that concerned because my search history gets cleared whenever my regular browsing history is cleared and I've already set it at a level I am ok with.

  • Use direct links for results? DuckDuckGo Lite does the same thing, and that one is authored by DuckDuckGo itself, not a 3rd party.

Thorin-Oakenpants May 15, 2017

Member

@RoxKilly Yeah, I don't think we'll go down that path. I mean we could list Wikipedia, Twitter, Reddit, IMDB and leave it at that (sample made-up list, don't bitch at me) as examples, pointing out that for frequently used sites it is best to use a site-specific engine. A separate wiki page would list how to create your own sanitized versions for any site.

crssi May 15, 2017

Collaborator

Call me what you like :), but I have enabled live typing search results, and this one returns clean output without "strange looking" URLs on the right side of each result, whereas most of the others don't.

Thorin-Oakenpants referenced this issue May 15, 2017

Open

sticky: wiki stiki #65

3 of 6 tasks complete

Atavic May 23, 2017

Collaborator

You can definitely use HTTP Method: GET

POST Method allows the use of parameter extension.

&abp=-1

Not using AdBlock Plus?

Atavic May 26, 2017

Collaborator

Flame on this:


Intro

Most search engines try to harvest sensitive information that goes beyond the scope of providing results for a keyword search. Some engines SAY they respect users' privacy:


[INFO]: behavioral marketing ...it's half the page.

Related to: search engine privacy

atomGit May 28, 2017

Collaborator

i'm workin' on an article regarding this - you can link to that, or publish it here in your wiki, or not use it at all if you don't want :)

i am busy with other stuff atm, but i should get it done in a week or 2...

earthlng referenced this issue Jun 21, 2017

Closed

ToDo: diffs FF54-FF55 #144

24 of 24 tasks complete

earthlng Aug 15, 2017

Member

what's the diff between startpage and ixquick?

from http://securityspread.com/2016/10/24/duckduckgo-startpage-2016-update/

I use startpage.com in most of my examples but you can use ixquick.com as well. Everything mentioned in this article applies to ixquick.com too as the two sites have merged earlier this year. There is also ixquick.eu which returns results from search engines that are not Google

earthlng Aug 17, 2017

Member

uBO rules to block what seem to be tracking images on startpage and ixquick:

! tracking images on startpage and ixquick
||/do/avtc?$image,important,domain=ixquick.com|ixquick.eu|startpage.com
||/do/showimage?$image,important,domain=ixquick.com|ixquick.eu|startpage.com
||/english/web/$image,important,domain=ixquick.com|ixquick.eu|startpage.com
||/tix2/$image,important,domain=ixquick.com|ixquick.eu|startpage.com
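
To make the structure of these rules easier to read, here is a rough Python sketch that splits one static filter line into its pattern, its options and the hosts named in `domain=`. It is an assumption-laden simplification for illustration only (the `NetworkFilter` and `parse_filter` names are made up here), not how uBO itself parses filters.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkFilter:
    pattern: str                                   # URL pattern part, e.g. "||/tix2/"
    options: list = field(default_factory=list)    # e.g. ["image", "important"]
    domains: list = field(default_factory=list)    # hosts listed in domain=


def parse_filter(line: str):
    """Split one filter line into pattern, options and domains.

    Simplified on purpose: lines starting with "!" are comments and
    return None, and the last "$" is assumed to start the option list.
    """
    line = line.strip()
    if not line or line.startswith("!"):
        return None
    pattern, sep, raw_options = line.rpartition("$")
    if not sep:                                    # no "$": whole line is the pattern
        return NetworkFilter(pattern=line)
    parsed = NetworkFilter(pattern=pattern)
    for option in raw_options.split(","):
        if option.startswith("domain="):
            parsed.domains = option[len("domain="):].split("|")
        else:
            parsed.options.append(option)
    return parsed


rule = parse_filter(
    "||/tix2/$image,important,domain=ixquick.com|ixquick.eu|startpage.com"
)
print(rule.pattern)
print(rule.options)
print(rule.domains)
```

Read this way, each rule above targets image requests only, is marked `important`, and applies only on the three listed sites.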

Thorin-Oakenpants Aug 18, 2017

Member

^^ Added to the Wiki

Thorin-Oakenpants Sep 9, 2017

Member

So I created 2 wiki entries for Search

  • 4.1 Search Engines
  • 4.2 Sanitizing

4.1: I suggest a TINY intro about using site-specific engines (but am not going to provide any), followed by a similar setup to Extensions: break the list into three or four sections, iconize some things maybe (not too much), such as No JS required. Note items such as whether the engine does its own indexing or pulls results from A, B or C, its privacy policy, etc.

So I guess we need to work out the attributes of each engine

4.2: a how to guide on sanitizing with ONE example - google.com
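
As a rough idea of what a sanitized XML could contain, the Python sketch below writes a minimal OpenSearch description whose search URL carries only the query term and no extra parameters. The short name, description text and the `build_plugin` helper are assumptions for illustration; the actual wiki example may look different.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenSearch 1.1 description format
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_plugin(short_name: str, description: str, template: str) -> str:
    """Return a minimal OpenSearch description document as a string."""
    ET.register_namespace("", OPENSEARCH_NS)       # emit a default xmlns
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}InputEncoding").text = "UTF-8"
    # The template keeps only the search terms: no client, source or
    # language parameters are appended.
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "text/html", "template": template},
    )
    return ET.tostring(root, encoding="unicode")


xml_text = build_plugin(
    "Google clean",                                 # hypothetical name
    "Google search without extra URL parameters",   # hypothetical text
    "https://www.google.com/search?q={searchTerms}",
)
print(xml_text)
```

The point of the sketch is the `template` attribute: whatever else the engine normally appends is simply left out.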

https://www.ghacks.net/2017/09/04/privacy-focused-search-engines-on-the-rise/#comment-4223372 - re https://www.findx.com

Thorin-Oakenpants Sep 9, 2017

Member

well that counts them out I guess. Do we need to type up some kind of table to work all this out? And we can just stick in 's and 's

Atavic Sep 9, 2017

Collaborator

A privacy-respecting search engine is like a polished turd. The best option is to use an extension that clears all the variables added to your searches. I'm using Yandex, and without JS the results are clean.
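
What "clearing the variables" amounts to can be sketched in a few lines of Python: keep the query parameter and drop everything else. The host `search.example`, the `client` parameter and the `strip_search_url` helper are made up for illustration (`abp=-1` is the parameter quoted earlier in this thread); a real extension would do this on live requests.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_search_url(url: str, keep=("q",)) -> str:
    """Drop every query parameter except those named in `keep`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep
    ]
    # The fragment is dropped as well, so only scheme, host, path and
    # the kept parameters survive.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


dirty = "https://search.example/search?q=ghacks-user.js&client=firefox-b&abp=-1"
print(strip_search_url(dirty))
# prints: https://search.example/search?q=ghacks-user.js
```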

Thorin-Oakenpants Dec 8, 2017

Member

we don't need this now - we are going to link to atomGit's articles on a single search engine wiki page and then recommend using only two engines: DDG and SearX - see #307

atomGit Dec 9, 2017

Collaborator

hi Pants - just so you know, i have some work to do on the search engine article - it doesn't entirely work for v57+

a commenter, the developer of the XML importer/exporter plug for the FF search engines, says that modifying the search scripts for v57+ has become more difficult - he offers a script to import/export since his plug won't work with v57+

read his comment if you want and if anyone has any input, let me know

it looks to me like Moz is really wanting to protect their source of revenue from the search engines by making it yet more difficult to modify the existing search scripts

Thorin-Oakenpants Dec 9, 2017

Member

yikes! I want to redo all my search engines too

Thorin-Oakenpants Dec 9, 2017

Member

wait ... https://bugzilla.mozilla.org/show_bug.cgi?id=1405670 .. does this mean all my AddToSearchBar search engines will vanish come FF58?

atomGit Dec 9, 2017

Collaborator

well, if i'm understanding what the XML import/export dev said correctly, that seems to be a possibility but i doubt this will happen - i think it's a bit more complex

so the /searchplugins folder will not be loaded in v58 - what does that mean? as ethically challenged as Moz has become, they're surely not going to prevent 3rd party engines - however, will FF then add the 3rd party engines to the .json file and add their own params regardless of the user's choice? i don't know

how will one add 3rd party engines? i don't know - maybe it will be by add-ons only - if FF doesn't load the /searchplugins folder, then i would think there is no way that a user can add an engine other than some 'official' method, such as an add-on

much of this is guess work, so take it for what it's worth

Forsaked Dec 9, 2017

Collaborator

Or you just generate your own search plug-in with your preferences at www.mycroftproject.com and import it from there, instead of from an XML/JSON file.

atomGit Dec 9, 2017

Collaborator

import them how?

atomGit Mar 8, 2018

Collaborator

@earthlng @Thorin-Oakenpants

uBO rules to block what seem to be tracking images on startpage and ixquick:

apparently they are not tracking images - Startpage saw my search engine page where i commented about these 'tracking' pixels - here's what they said:

Startpage: BTW StartPage/Ixquick do not use tracking images. What you noted are non-tracking clear GIFs. Here’s a KB article about that.

Me: regarding the 1×1 gif images, i don’t understand how an image can be used to prevent a 3rd party from setting a cookie – can you explain?

Startpage: We have a proxy service that lets you view a result anonymously (by clicking Proxy near a result). When you view a webpage this way, our servers load the page on your behalf, and then provide the content to you. That way the website you are viewing won’t see you. Their website content is served through our domain. Webpages have many ways to set cookies – through Javascript and otherwise. When we proxy the webpage on your behalf, we take many steps to prevent them from doing so. (If they did successfully set a cookie, the cookie would be stored on our domain.) To add extra protection, we then display this extra 1×1 image from our domain that includes cookie headers to clear any such cookies. That way, if any external website you viewed through our proxy manages to set a cookie on our proxy’s domain, we immediately clear that cookie.

Me: why are several 1×1 images used – why not just 1?

Startpage: It is simpler to offer a different image for each different aggregate count we are keeping.

Me: why do the file names appear to contain a UIN that changes with every search apparently?

Startpage: There is no identifier. Rather, there is something called an “anticache” parameter that has a random number. This prevents the image from being “cached” by the browser – as browser caching would prevent the loading – hence would prevent the aggregate counts from being correct.

Me: why are these clear GIFs not loaded when 0 results are returned?

Startpage: A different part of the code is used when there are no results, so it might not include the same aggregate counts.
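
For readers wondering how an image response can clear a cookie at all: any HTTP response, including one for a 1×1 GIF, can carry a Set-Cookie header that expires an existing cookie. The Python sketch below builds such a header. It only illustrates the general mechanism; the cookie name, the placeholder value and the `clearing_header` helper are hypothetical, and nothing here is taken from Startpage's actual code.

```python
from http.cookies import SimpleCookie


def clearing_header(cookie_name: str) -> str:
    """Build a Set-Cookie header that tells the browser to drop a cookie."""
    cookie = SimpleCookie()
    cookie[cookie_name] = "deleted"            # placeholder value, soon discarded
    cookie[cookie_name]["path"] = "/"
    cookie[cookie_name]["max-age"] = 0         # expire immediately
    cookie[cookie_name]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"
    return cookie.output(header="Set-Cookie:")


# "third_party_id" stands in for a cookie set through the proxy domain.
header = clearing_header("third_party_id")
print(header)
```

A server would attach this header to the image response; the browser then removes the matching cookie for that domain and path.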

earthlng Mar 9, 2018

Member

Thanks. I wrote "what seem to be tracking images" for a reason but somehow that got lost when Pants added it to the wiki.

"aggregate count" ??

why can't they clear the cookie with the main document? 5+ image requests to clear a cookie? IDK

earthlng Mar 9, 2018

Member

btw this 1 is missing in the wiki:
||/tst2/*$image,important,domain=ixquick.com|ixquick.eu|startpage.com

The EasyPrivacy list also detects and blocks elt.gif. All of this just to clear a cookie? IDK man

Thorin-Oakenpants Mar 9, 2018

Member

if you want, please update the wiki - add missing line, add ref to https://support.startpage.com/index.php?/Knowledgebase/Article/View/260/0/why-is-startpage-loading-1x1-gifs-clear-pixel-images-when-i-search or whatever, add "seems to be" ... close when and if you get around to it

Thorin-Oakenpants Apr 21, 2018

Member

done

PS: thanks @atomGit for the info

h1z1 Apr 22, 2018

@atomGit :

Rather, there is something called an “anticache” parameter that has a random number. This prevents the image from being “cached” by the browser – as browser caching would prevent the loading – hence would prevent the aggregate counts from being correct.

What? Why not use proper caching headers, since they mitm users anyway? What is the expire time on these images?

atomGit Apr 23, 2018

Collaborator

i'm too stupid to answer your question - all i can provide is what they told me - it sounds to me like Startpage is an ethical bunch, but i can't be sure of that - i also did not fully comprehend their explanation for these random anti-cache strings, however maybe we have to consider that the service they're providing is a bit unique in that they are acting as a proxy and so there may be technical considerations which may be beyond the norm

h1z1 Apr 23, 2018

Been a while since I used / looked at Startpage. Taking a peek, they are indeed doing some rather silly things - elt.gif is one. The short version is that although they set a cache policy (3456000 seconds), by adding a random nonce they effectively negate it. Worse, it actually allows them to snoop the cache later AND wastes resources by pointlessly filling the browser's cache.

Whether or not they do, or have ever done, that is a different matter, since it would amount to some of the very tracking they claim to prevent.
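
The effect of the random nonce can be shown with a toy model. The Python sketch below assumes a cache keyed purely by URL, which is a simplification of how browser caches behave, and uses a made-up host plus a parameter named `anticache` as in Startpage's reply. The 3456000 seconds (40 days) is the cache lifetime mentioned above.

```python
import secrets


class UrlKeyedCache:
    """Toy cache: an entry is reused only if the exact URL was seen before."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, url: str, max_age: int = 3456000) -> int:
        if url in self.store:
            self.hits += 1                 # served from cache, no request sent
        else:
            self.misses += 1               # request goes out, response is stored
            self.store[url] = max_age
        return self.store[url]


def image_url(anticache=None) -> str:
    base = "https://proxy.example/elt.gif"     # hypothetical host
    return base if anticache is None else f"{base}?anticache={anticache}"


plain = UrlKeyedCache()
for _ in range(5):
    plain.fetch(image_url())

busted = UrlKeyedCache()
for _ in range(5):
    busted.fetch(image_url(secrets.token_hex(8)))   # fresh nonce per request

print(plain.hits, plain.misses, len(plain.store))
print(busted.hits, busted.misses, len(busted.store))
```

Without the nonce, four of the five loads are cache hits and one entry is stored. With it, every load is a miss and five entries pile up, each still valid for 40 days, which is the "pointlessly filling the cache" part.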
