Can't use googleapis as ( now terminates a URL #1054

gchiu · 2020-02-28T06:53:28Z

>> https://www.youtube.com?test=(snippet=1)
** Script Error: snippet=1 is VOID!
** Where: console
** Near: [snippet=1 ~~]
** Line: 1


Mark-hi · 2020-02-28T12:56:13Z

You are aware that Google Chrome shows that URL as https://www.youtube.com/?test=%28snippet%3D1%29, right?
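For reference, a minimal Python sketch (not Rebol, and not a claim about what Chrome does internally) that reproduces the escaped form quoted above using only the standard library. The variable names are illustrative.

```python
from urllib.parse import quote

# The query value from the original report, with its unescaped parentheses.
raw_value = "(snippet=1)"

# safe="" asks quote() to percent-encode every character except
# unreserved ones (letters, digits, and "_.-~").
encoded = quote(raw_value, safe="")

url = "https://www.youtube.com/?test=" + encoded
print(url)
```

With the parentheses and the equals sign escaped, nothing in the query string can be mistaken for a GROUP! delimiter.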

hostilefork · 2020-02-28T14:05:38Z

I've said before and will say again, that we need a single location that pulls together all the thinking on rules for URL!. I cannot hold it all in my head, and I need an overview and design writeup that explains the history so far, the desired properties, etc. etc.

Problems come up so frequently that it's clear people are passionate about it, but not quite passionate enough to write a start-to-finish manifesto that covers ALL the bases. Folks seem to spot the thing that's bothering them that day and ask for it to be addressed in a way that helps with the inconvenience they're having right now. But what we can tell about this situation is that a piecemeal solution will give piecemeal results...that are ultimately unsatisfactory and require constant revisitation.

This issue and the last one have been about the legality of unescaped non-ASCII delimiters, which seems a messy topic:

#1046

Other things that I can pull to mind about this issue are:

Rebol's intent for the URL!/URI! type is broader than just http://, so it may not be beholden to the RFC rules for an http-style URL!. We should be cautious about building automatic behaviors into the source level (for instance, rules that dictate how the scanner treats % sequences) because foo:// might have entirely different ideas about %.
@rgchris is adamant about the value of direct paste-from-browser. But modern browsers present many non-ASCII alphanumeric characters literally (such as accented characters), even though the wire format of URLs requires them to be escaped as UTF-8 bytes. The behavior is inconsistent across browsers and browser versions, each encoding its own logic for interpreting URLs that mix unescaped characters with percent-encoded escape sequences.
To deal with the inherent ambiguity over whether a URL taken from a browser has already been escaped, I've suggested we enforce a rule that any percent sign in a URL literal must be followed by hex digits, and that those escapes must represent a valid UTF-8 sequence. As I mentioned above, this should likely be a rule implemented in the URL! handling specific to http://, not a scanner rule that stops other URL!s from being accepted at LOAD-time.
I've suggested that inside the mechanics for FILE! and URL!, once an API has reached the level where it knows it must be dealing with a file or url, TEXT! be used from that point on. This lets the mechanical layer treat TEXT! as an indicator that escaping has already happened, which avoids double-escaping (or unescaping). That is, all URL!s are assumed to be in a mixed form that allows escapes while tolerating unescaped characters, to give some wiggle room...and the already-escaped sequences won't be double-escaped during the translation to TEXT!.

(Note: The above mechanic is implemented in FILE-TO-LOCAL and LOCAL-TO-FILE. I don't know how many people have noticed...but it won't let you say LOCAL-TO-FILE on a FILE! by default...)
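To make the last two points in the list concrete, here is a minimal sketch in Python rather than Rebol. The function names `percents_are_valid_utf8` and `escape_tolerantly`, and the choice of which characters to leave unescaped, are assumptions made for illustration; none of this is taken from the codebase. The first function checks that every % begins a two-hex-digit escape and that each run of escapes decodes as UTF-8. The second escapes what is still unescaped while leaving existing %XX sequences alone.

```python
import re
from urllib.parse import quote

# Matches one percent escape: '%' followed by exactly two hex digits.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

# Hypothetical choice: characters left as-is by escape_tolerantly().
# '%' is included so that existing escapes are not escaped a second time.
# '(' and ')' are deliberately left out, so they get encoded.
_LEAVE_ALONE = "%:/?#[]@!$&'*+,;="


def _decodes_as_utf8(raw: bytearray) -> bool:
    """Return True if the collected bytes form valid UTF-8 (empty is fine)."""
    try:
        bytes(raw).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def percents_are_valid_utf8(url_text: str) -> bool:
    """Check that each '%' starts a %XX escape and that every consecutive
    run of escapes decodes as a valid UTF-8 byte sequence."""
    run = bytearray()
    i = 0
    while i < len(url_text):
        if url_text[i] == "%":
            if _ESCAPE.match(url_text, i) is None:
                return False  # stray '%' or non-hex digits after it
            run.append(int(url_text[i + 1:i + 3], 16))
            i += 3
        else:
            # A literal character ends the current run of escapes.
            if not _decodes_as_utf8(run):
                return False
            run.clear()
            i += 1
    return _decodes_as_utf8(run)


def escape_tolerantly(url_text: str) -> str:
    """Percent-encode unescaped characters (non-ASCII is encoded as UTF-8)
    while passing existing %XX sequences through unchanged."""
    return quote(url_text, safe=_LEAVE_ALONE)


if __name__ == "__main__":
    mixed = "https://example.com/caf\u00e9?q=a%20b"
    print(percents_are_valid_utf8(mixed))
    print(escape_tolerantly(mixed))
```

Because `escape_tolerantly` passes % through untouched, a stray % would survive it, so in this sketch the validity check is meant to run first. Applying `escape_tolerantly` twice gives the same result as applying it once, which is the "no double-escaping" property described above.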

Those are some of the parts. But we need a whole design that can be carried out in the codebase. I don't even really have much feedback on how the modification @rgchris requested, treating URLs "as-is", has been working out...but we have been running with it for a while.

We also need complete tests as a spec for what the code should do. If Red has tests for URL scanning then someone should dig those up as well, to examine for reasoning and differences.

gchiu · 2020-02-28T21:09:45Z

Ah, it looks like I have to encode both ( and ) for

https://www.googleapis.com/youtube/v3/videos?part=snippet&id={YOUTUBE_VIDEO_ID}&fields=items(id%2Csnippet)&key={YOUR_API_KEY}

otherwise rebol barfs with

** Syntax Error: invalid "word" -- "id%2Csnippet"
** Where: transcode if load trap ext-console-impl entrap console
** Near: (line 1) https://www.googleapis.com/youtube/v3/videos?part=snippet&id={YOUTUBE_VIDEO_ID}&fields=items(id%2Csnippet)&key={YOUR_API_KEY}
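A small Python sketch of the workaround (Python's standard library, not Rebol; the variable names are illustrative, and the `{YOUTUBE_VIDEO_ID}` and `{YOUR_API_KEY}` placeholders are kept exactly as they appear in the URL above): encode the whole `fields` value before pasting it into the URL, so that neither parenthesis reaches the scanner.

```python
from urllib.parse import quote

# The unescaped value of the "fields" parameter.
fields = "items(id,snippet)"

# safe="" percent-encodes the comma and both parentheses.
encoded_fields = quote(fields, safe="")

# Plain string concatenation keeps the literal {…} placeholders intact.
url = (
    "https://www.googleapis.com/youtube/v3/videos?part=snippet"
    "&id={YOUTUBE_VIDEO_ID}"
    "&fields=" + encoded_fields +
    "&key={YOUR_API_KEY}"
)
print(url)
```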


-- Graham Chiu

hostilefork assigned gchiu, rgchris and IngoHohmann Feb 28, 2020
