more search engines treated as external sites #2586

Closed
anonymous-piwik-user opened this Issue · 17 comments

4 participants

@anonymous-piwik-user

Search engines wrongly considered external websites:

www.google.it/imgres?q=beato+vincenzo+romano&rlz=1W1ADSA_it&biw=1280&bih=556&tbm=isch&tbnid=GoEpBEHlKndZFM:&imgrefurl=h t t p://it.cathopedia.org/wiki/Beato_Vincenzo_Romano&docid=lwRZRPTQtuqNeM&w=200&h=204&ei=1_YqTsquFdOt8QPT7KGUDA&zoom=1&iact=rc&dur=78&page=1&tbnh=137&tbnw=127&start=0 (please remove the spaces; I had to insert them to keep this report from being flagged as spam)

search.findeer.com/cafarnaum (a search for "cafarnaum")

http://scour.com/search/web/cafarnaum (a search for "cafarnaum")

www.google.com/imgres?imgurl=http://it.cathopedia.org/w/images/b/ba/Zeffirelli_Ges%C3%B9_Nazareth_002.jpg&imgrefurl=h t t p://it.cathopedia.org/wiki/Ges%25C3%25B9_di_Nazareth_%28sceneggiato_televisivo%29&usg=__2nXAIxZzstkylCKLMYT6zyOU2BQ=&h=261&w=193&sz=12&hl=it&start=28&zoom=1&tbnid=BQIQJYLidRDj1M:&tbnh=127&tbnw=92&ei=YGgbTuaULdPA8QPBt_kW&itbs=1&iact=hc&vpx=746&vpy=90&dur=289&hovh=208&hovw=154&tx=114&ty=106&page=2&ndsp=29&ved=1t:429,r:4,s:28&biw=1280&bih=713 is the result of a Google Images search (please remove the spaces; I had to insert them to keep this report from being flagged as spam)

www.google.ch/imgres?q=gerico+giordania+il+paese&hl=it&safe=off&sa=G&biw=1024&bih=459&gbv=2&tbm=isch&tbnid=1RbnTzuwQ5xLdM:&imgrefurl=http://it.cathopedia.org/wiki/Gerico&docid=7h1k1qwY74vIWM&w=300&h=225&ei=QGApTtGcA8fEsgaFnKHnCw&zoom=1&iact=rc&dur=78&page=10&tbnh=135&tbnw=179&start=77&ndsp=9&ved=1t:429,r:3,s:77&tx=106&ty=71

www.google.it./m?q=search

www.url.org/?g=search&ft=2&f=1&l=it&a=1&r=dizionario+lingua+italiana+zanichelli&s=go&c=001&t=001&d=40737&q=dizionario+zanichelli&qs=presbitero

buscador.terra.cl/Results.aspx?source=Search (the URL doesn't contain the search term, which is presumably sent as a POST parameter)

eu.ixquick.com (same as the preceding one)

int.search-results.com/web?qsrc=1&o=16537&l=dis&atb=sysid%3D1%3Aappid%3D393%3Auid%3Dd15f554585901fc7%3Auc%3D1309549109%3Asrc%3Dhmp%3Ao%3D16537%3Aq%3Dbibbia%2520preghiera%2520padre%2520nostro&q=bibbia+cattolica+preghiera+padre+nostro

mundo.busca.uol.com.br/buscar.html?q=teologia+della+pazienza&ad=on

search.juno.com/search?action=search&source=nextpage&query=Nostra+Signora+di+Lourdes%2C+Tor+Marancia%2C+Roma&start=30&adpage=4&orig_source=startpage-free-c

searchservices.verizon.com/search/ws.portal?_nfpb=true&_pageLabel=google_results&q=Storie+di+Locri+E+Gerace&channel=MyVrzn&clientid=Cnsmr&PAGING_QUERY=true&start=10&pagenum=2

start.flashvideodownloader.org/result.php?q=congregazion+eper+il+clero+email%3A+&ie=UTF-8&cx=partner-pub-5087362176467115:lyglkqaff6i&cof=FORID:10&sa=Search#1311442684842&nq=congregazion%20eper%20il%20clero%20email%3A

www.metacrawler.com/info.metac.test.b8/search/web?fcoid=417&fcop=topnav&fpid=27&q=chiesa+catolica+in+egitto

www.webmii.es/Result.aspx?f=Claudio&l=Doglio&r=es

www.yatedo.fr/search/profil?q=Eucologia&btn_s=&c=all

@mattab
Owner

Thanks for the suggestion. Would it be possible for you to propose a patch to track these search engines, as explained in http://piwik.org/faq/general/#faq_39?
That would greatly help :) Thanks!

@anonymous-piwik-user

I've sent some of the URLs via email, as described in http://piwik.org/faq/general/#faq_39

Other URLs supposedly need two parameters. How do you handle them?

@sgiehl
Collaborator

(In [5180]) refs #2586 added search.juno.com to google powered search engines

@sgiehl
Collaborator

(In [5181]) refs #2586 added search engine www.url.org

@sgiehl
Collaborator

(In [5182]) refs #2586 added searchresults.verizon.com to google powered search engines

@sgiehl
Collaborator

(In [5183]) refs #2586 added search engine scour.com

@mattab
Owner

Please submit a patch for the remaining missing search engines (as per the FAQ). Thanks for your feedback!

@anonymous-piwik-user

"www.google.it" => array( "google", "q", "m?q={k}"),

"mundo.busca.uol.com.br" => array( "mundo.busca.uol", "q", "buscar.html?q={k}"),

"www.yatedo.fr" => array( "yatedo", "q", "search/profil?q={k}"),

"www.metacrawler.com" => array( "metacrawler", "q", "info.metac.test.b8/search/web?q={k}"),

"int.search-results.com" => array( "search-result", "q", "web?q={k}"),
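
For illustration only, here is a minimal sketch of how entries of this shape (host, then engine name, keyword parameter, and path template) could be applied to a referrer URL. It is written in Python rather than Piwik's PHP and is not Piwik's actual code: the names SEARCH_ENGINES, extract_keyword and build_search_url are hypothetical, and matching on the exact host name is an assumption.

```python
from urllib.parse import urlsplit, parse_qs, quote_plus

# Hypothetical table mirroring the entries proposed above:
# host -> (engine name, keyword parameter, path template)
SEARCH_ENGINES = {
    "mundo.busca.uol.com.br": ("mundo.busca.uol", "q", "buscar.html?q={k}"),
    "www.yatedo.fr": ("yatedo", "q", "search/profil?q={k}"),
    "int.search-results.com": ("search-result", "q", "web?q={k}"),
}


def extract_keyword(referrer):
    """Return (engine name, keyword), or None if no keyword can be determined."""
    # Referrers in this report are written without a scheme, so add one if missing.
    if not referrer.startswith(("http://", "https://")):
        referrer = "http://" + referrer
    parts = urlsplit(referrer)
    entry = SEARCH_ENGINES.get(parts.hostname or "")
    if entry is None:
        return None  # host is not in the table: treated as an external website
    name, param, _template = entry
    values = parse_qs(parts.query).get(param)
    if not values:
        return None  # known host, but the keyword is not in the URL
    return name, values[0]


def build_search_url(host, keyword):
    """Fill the {k} placeholder of the host's path template with the keyword."""
    _name, _param, template = SEARCH_ENGINES[host]
    return "http://" + host + "/" + template.replace("{k}", quote_plus(keyword))


print(extract_keyword("mundo.busca.uol.com.br/buscar.html?q=teologia+della+pazienza&ad=on"))
print(extract_keyword("buscador.terra.cl/Results.aspx?source=Search"))
print(build_search_url("www.yatedo.fr", "Eucologia"))
```

Under these assumptions, a referrer whose host is missing from the table, or whose URL carries no keyword parameter (as with the POST-based engines listed earlier), yields no keyword.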

@anonymous-piwik-user

This ticket can be reopened.

@sgiehl
Collaborator

Ok. Let's have a look at the suggested entries:

  • google.it should already be discovered by the existing rules
@sgiehl
Collaborator

(In [5649]) refs #2586 fixed detection of infospace powered search engines like metacrawler

@sgiehl
Collaborator

(In [5650]) refs #2586 added search engine busco.uol.com.br

@sgiehl
Collaborator

(In [5651]) refs #2586 added missing int.search-results.com

@sgiehl
Collaborator

(In [5652]) refs #2586 added yatedo.com / yatedo.fr

@robocoder

(In [5655]) refs #2586 - fix build

@mattab
Owner

Thanks paolobenve & Steve, always good to keep this list up to date, it makes Piwik look sharper!

@sgiehl
Collaborator

I'm closing this ticket now. The remaining URLs either aren't search engines or don't let us determine the keyword.

@anonymous-piwik-user added this to the Piwik 1.7 milestone
This issue was closed.