Trying to specify `doc-open-url` results in a useless F1 webpage #1798

blerner · 2017-09-06T22:58:53Z

In the past, because our lab machines have DrRacket installed virtually, we've been using a hack to redirect the F1 help-search page. @elibarzilay revised that hack into a supported mechanism, so now we can simply say

raco pkg config --set -i doc-open-url http://docs.racket-lang.org/

and F1 should simply work. (It used to be a different URL, http://download.racket-lang.org/docs/<version>/html/, apparently, but I'm guessing this is the new correct one...) Trouble is, if you actually point DrRacket at this URL, you get no results, because the context is set to, e.g. Intermediate+Student+Language -- note the plusses. @stchang and I tracked this behavior down today, and we unraveled the following:

/usr/share/racket/pkgs/scribble-lib/help/search.rkt (which I can't find in the repo, so I can't link to it directly) defines perform-search, which builds up a uri-encoded query string containing the context and the search term.
It then passes that string into send-main-page, which runs (url-query (string->uri (format "q?~a" query))), to parse that string back into a url structure, then peel off the query portion of it as an alist.
It appends that alist with any existing query from the doc-open-url, and builds up a new url which it then converts to a string via url->string and sends over the network.

Now for the tricky bits:

The initial context includes things like L: Intermediate Student Language, with spaces. Those spaces get uri-encoded to %20.
The various components of the query string get concatenated with +.
Then string->uri decodes the whole thing back to an alist whose entries are strings with spaces.
Then the final url->string encodes the query string using alist->form-urlencoded which turns the spaces in the strings into +, and then concatenates the various query components with & (I think; it might be ;).
Finally, the server seems to choose not to translate the + back to spaces, despite the warning in the code (uri-codec.rkt):

;; NB: RFC 2396 supersedes RFC 1738.

;; This differs slightly from the straight encoding in RFC 2396 in
;; that `+' is allowed, and represents a space.  We follow this
;; convention, encoding a space as `+' and decoding `+' as a space.
;; There appear to be some brain-dead decoders on the web, so we also
;; encode `!', `~', `'', `(' and ')' using their hex representation.
;; This is the same choice as made by the Java URLEncoder.

;; Draws inspiration from encode-decode.scm by Kurt Normark and a code
;; sample provided by Eli Barzilay

Presumably, this is because identifiers like scene+line need their pluses intact. But it leads to a broken context on the search page, and no results get found.

So it seems like one or more of the following things are going wrong here:

send-main-page shouldn't tear apart its query string, or it should do it properly, or something
url-string shouldn't be using alist->form-urlencoded, or alist->form-urlencoded shouldn't use form-url-encode, to handle spaces in the provided arguments
And, the server needs to correctly handle + characters in its query strings and/or it needs to conspire better with send-main-page to encode +s in Racket identifier names so that +s in query strings can be unambiguously decoded.

Note: This doesn't affect local installations of DrRacket, because those send F1 pages to a local file:// URL, and that control path doesn't muck with the query string at all.

I don't think this can be fixed purely server-side, since it affects code in DrRacket too, so this fix won't make it into 6.10. But if there's a patch we can try, we can probably hack that into our deployment and try it out...

The text was updated successfully, but these errors were encountered:

stchang · 2017-09-06T23:09:55Z

Here's the start of the referenced code: https://github.com/racket/scribble/blob/master/scribble-lib/help/search.rkt#L63

mflatt · 2017-09-08T18:37:54Z

As far as I can tell, the part that's going wrong is on the server side, although it's JavaScript sent back to the browser. The URL is parsed using string operations, and a "label" query is converted using unescape in JavaScript, which does not decode + into a space.

To avoid the immediate problem, I see advice like
https://stackoverflow.com/questions/12042592/decoding-url-parameters-with-javascript,
but surely that's not the right approach.

The deeper problem is that the JavaScript code shouldn't try to do its own parsing of a URL string. The obviously better choice is to use the URL class in JavaScript... but apparently URL is new and not always supported by browsers???

So, it seems like a stupid question, but: How do I correctly and portably parse a URL in JavaScript to extract a query value?

samth · 2017-09-08T18:55:31Z

This suggests that you can use URL with high probability: http://caniuse.com/#feat=url

samth · 2017-09-08T18:57:31Z

And if you want a backup solution for IE, here's a polyfill: https://github.com/webcomponents/URL

jackfirth · 2017-09-08T19:12:51Z

Part of the cause here is that the HTTP URI scheme RFC does not standardize any format whatsoever for the query string; it can basically be any blob of characters. The foo=1&bar=2 syntax is purely by convention of (nearly all) implementations. Many servers and clients have slightly different interpretations and formatting rules with no clear indication which set of rules is "correct".

mflatt · 2017-09-08T20:58:00Z

@blerner I've updated docs.racket-lang.org. Does using "F1" in DrRacket now work?

blerner · 2017-09-08T21:10:50Z

Looks like it helps: F1 on scene+line yields the URL http://docs.racket-lang.org/search/search-context.html?q=scene%2Bline&hq=O%3A%7B+L%3Alang%2Fhtdp-beginner+T%3Ateachpack+T%3Apicturing-programs+%7D&label=Beginning+Student which then seems to work properly -- at least, on Linux, in Chromium and in Firefox. I can't test on the lab machines right now, because I don't have access to changing their config myself, but it should suffice...

Thanks!

Use the `URL` class in JavaScript instead of manually parsing a URL string. Related to #1798

blerner · 2017-09-13T19:26:08Z

Ah, it's not quite fixed. The resulting page doesn't work in IE. The relevant JS code is

if (location.search.length > 0) {
  var u = new URL(location);
  var newsearch = "";
  for(var key of u.searchParams.keys()) { // <<<=== this line
    var val = u.searchParams.get(key);
    // an empty "hq=" can be used to clear the cookie
    if (key == "hq") {
      SetCookie("PLT_ContextQuery", val);
    } else if (key == "label") {
      SetCookie("PLT_ContextQueryLabel", val);
    } else {
      if (newsearch == "") newsearch = "?";
      newsearch = newsearch + "&" + key + "=" + encodeURIComponent(val);
    }
  }
  // localtion.replace => jump without leaving the current page in the history
  // (the new url uses "index.html" and the new search part)
  location.replace(location.href.replace(/\/[^\/?#]*[?][^#]*/,
                                         "/index.html" + newsearch));
} else {
  // no parameters found? just jump to the search page...
  location.href = "index.html";
}

I'm pretty sure IE doesn't support for-of loops. You'd need to use the wordier for-in loop with a hasOwnProperty check...

mflatt added a commit that referenced this issue Sep 10, 2017

fix URL parsing in search-in-context browser trampoline

c8ecb5e

Use the `URL` class in JavaScript instead of manually parsing a URL string. Related to #1798

blerner mentioned this issue Sep 13, 2017

Right-click "View documentation for ..." doesn't redirect via doc-open-url like F1 does #1802

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trying to specify `doc-open-url` results in a useless F1 webpage #1798

Trying to specify `doc-open-url` results in a useless F1 webpage #1798

blerner commented Sep 6, 2017 •

edited

stchang commented Sep 6, 2017

mflatt commented Sep 8, 2017

samth commented Sep 8, 2017

samth commented Sep 8, 2017

jackfirth commented Sep 8, 2017 •

edited

mflatt commented Sep 8, 2017

blerner commented Sep 8, 2017

blerner commented Sep 13, 2017

Trying to specify doc-open-url results in a useless F1 webpage #1798

Trying to specify doc-open-url results in a useless F1 webpage #1798

Comments

blerner commented Sep 6, 2017 • edited

stchang commented Sep 6, 2017

mflatt commented Sep 8, 2017

samth commented Sep 8, 2017

samth commented Sep 8, 2017

jackfirth commented Sep 8, 2017 • edited

mflatt commented Sep 8, 2017

blerner commented Sep 8, 2017

blerner commented Sep 13, 2017

Trying to specify `doc-open-url` results in a useless F1 webpage #1798

Trying to specify `doc-open-url` results in a useless F1 webpage #1798

blerner commented Sep 6, 2017 •

edited

jackfirth commented Sep 8, 2017 •

edited