getgooglequery does not work correctly with Google Forms links #2848

Closed
jakubklimek opened this issue Sep 5, 2019 · 2 comments · Fixed by #2891

Description

When I open a page on some DokuWiki installations and the HTTP request carries a Referer header such as referer: https://www.google.com/url?q=https://opendata.gov.cz/edu:konference:2019&sa=D&ust=1567692046621000&usg=AFQjCNG_0v_FF28O_RTQNJ0RZqqp49fGig (this is what Google Forms sends when a form links to https://opendata.gov.cz/edu:konference:2019), the page shows the words from the https://opendata.gov.cz/edu:konference:2019 URL marked as search results, even though I did not search for anything.

Apparently (giterlizzi/dokuwiki-template-bootstrap3#438 (comment)) this is DokuWiki's getgooglequery functionality, which highlights search terms when a visitor arrives from a search engine. It does not, however, work correctly when the link is placed in a Google Form.
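
To illustrate why the URL ends up highlighted, here is a minimal Python sketch of referer-based term extraction. It is not DokuWiki's actual code (DokuWiki is written in PHP); the function name `naive_query_terms` and the word-splitting rule are assumptions made only for this illustration.

```python
import re
from typing import List
from urllib.parse import urlparse, parse_qs


def naive_query_terms(referer: str) -> List[str]:
    """Hypothetical stand-in for referer-based search highlighting."""
    parsed = urlparse(referer)
    # Only look at referers that come from a Google host.
    if "google." not in parsed.netloc:
        return []
    # Take the first value of the "q" parameter, if there is one.
    q = parse_qs(parsed.query).get("q", [""])[0]
    # Split on anything that is not a letter or digit and drop empty pieces.
    return [word for word in re.split(r"[^0-9A-Za-z]+", q) if word]


referer = (
    "https://www.google.com/url?q=https://opendata.gov.cz/edu:konference:2019"
    "&sa=D&ust=1567692046621000&usg=AFQjCNG_0v_FF28O_RTQNJ0RZqqp49fGig"
)
print(naive_query_terms(referer))
# ['https', 'opendata', 'gov', 'cz', 'edu', 'konference', '2019']
```

With a Google Forms redirect, `q` holds the target URL rather than a search phrase, so every fragment of the URL becomes a "search term" to highlight.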

Steps to reproduce

  1. See this example, or this one: curl https://wiki.bash-hackers.org/ -H 'referer: https://www.google.com/url?q=https://opendata.wiki.gov.cz/edu:konference:2019&sa=D&ust=1567692046621000&usg=AFQjCNG_0v_FF28O_RTQNJ0RZqqp49fGig' | grep mark

Expected behavior:

No search results marked with <span class="mark">

Actual behavior:

Words from the referer header are marked with <span class="mark">

Versions

DokuWiki Release 2018-04-22b "Greebo"

Screenshots or Logs

image

Collaborator

phy25 commented Sep 8, 2019

This one is interesting. I don't think we can get the user's query from the Google referrer anymore when users are redirected through Google's /url endpoint.

One example: https://www.google.com/url?sa=t&source=web&rct=j&url=https://www.dokuwiki.org/&ved=2a-long-string-that-may-include-my-search-info

Maybe we should ignore it if it's from google.com/url? I am not that familiar with how Google's recent referrer works, so we need more input.

Collaborator

phy25 commented Oct 20, 2019

We can:

  1. Remove Google (since Google never includes the query in the URL)
  2. or drop the query if the query is a URL (we can just check whether it contains //)

I decided to implement option 2.

phy25 added a commit that referenced this issue Oct 20, 2019
We don't want to split the URL to highlight the "query", especially when q is the URL of the page itself - e.g. Google Form's redirect https://www.google.com/url

This will also ignore queries like `syntax site:https://www.dokuwiki.org` but it should be fine. Just don't want to use a full parser here.

This fixes #2848.
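
The check described in the commit message can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the referenced commit: the function name `query_from_referer` is hypothetical, and the search-engine host matching that a real implementation would need is left out.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def query_from_referer(referer: str) -> Optional[str]:
    """Hypothetical helper: return the search query, or None if unusable."""
    parsed = urlparse(referer)
    # parse_qs decodes "+" and percent-escapes in the parameter value.
    q = parse_qs(parsed.query).get("q", [""])[0].strip()
    if not q:
        return None
    # Option 2: a value containing "//" is treated as a URL, not a query.
    # No full URL parser is used here; a substring check is enough.
    if "//" in q:
        return None
    return q


forms_referer = (
    "https://www.google.com/url?q=https://opendata.gov.cz/edu:konference:2019"
    "&sa=D&ust=1567692046621000&usg=AFQjCNG_0v_FF28O_RTQNJ0RZqqp49fGig"
)
print(query_from_referer(forms_referer))  # None
print(query_from_referer("https://www.google.com/search?q=dokuwiki+syntax"))
# dokuwiki syntax
```

As the commit message notes, the same substring check also discards a query such as `syntax site:https://www.dokuwiki.org`, which is the accepted trade-off for not using a full parser.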