Can't use googleapis as ( now terminates a URL #1054

Open
gchiu opened this issue Feb 28, 2020 · 3 comments

gchiu commented Feb 28, 2020

>> https://www.youtube.com?test=(snippet=1)
** Script Error: snippet=1 is VOID!
** Where: console
** Near: [snippet=1 ~~]
** Line: 1

Mark-hi commented Feb 28, 2020

You are aware that Google Chrome shows that URL as https://www.youtube.com/?test=%28snippet%3D1%29, right?
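
For reference, the escaped form Chrome displays is ordinary percent-encoding of the parenthesized query value. A minimal sketch using Python's standard `urllib.parse` (not Ren-C code; the variable names are illustrative only) reproduces it:

```python
from urllib.parse import quote, unquote

query_value = "(snippet=1)"

# safe="" asks quote() to escape every reserved character,
# including "(", ")" and "=".
encoded = quote(query_value, safe="")
print(encoded)  # prints %28snippet%3D1%29

# Decoding recovers the original text, so both spellings carry the same value.
assert unquote(encoded) == query_value
```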

hostilefork (Member) commented Feb 28, 2020

I've said before, and will say again, that we need a single location that pulls together all the thinking on rules for URL!. I cannot hold it all in my head, and I need an overview and design writeup that explains the history so far, the desired properties, and so on.

Problems come up so frequently that it's clear people are passionate about it, but not quite passionate enough to write a start-to-finish manifesto that covers ALL the bases. Folks seem to spot the thing that's bothering them that day and ask for it to be addressed in a way that helps with the inconvenience they're having right now. But what we can tell about this situation is that a piecemeal solution will give piecemeal results...that are ultimately unsatisfactory and require constant revisitation.

This issue and the last one have been about the legality of unescaped delimiter characters (here, parentheses), which seems a messy topic:

#1046

Other things that come to mind about this issue:

  • Rebol's intent for the URL!/URI! type is broader than just http://, and other schemes may not be beholden to the RFC rules for an http-style URL!. We should be cautious of building in automatic behaviors at the source level--for instance, rules that dictate how the scanner treats % sequences--because foo:// might have entirely different ideas about %.

  • @rgchris is adamant about the value of direct paste-from-browser. But modern browsers display many non-ASCII characters literally (such as accented letters), even though the wire format of URLs requires them to be percent-encoded as UTF-8 bytes. The behavior is inconsistent across browsers and browser versions...each encodes its own logic for interpreting URLs that mix unescaped characters with percent-encoded escape sequences.

  • To deal with the inherent ambiguity in browsers, and whether what they show has been run through escaping or not, I've suggested we enforce a rule that every percent sign in a URL literal must be followed by hex digits, and that those escapes together must decode to a valid UTF-8 sequence. As I mentioned above, this should likely be a rule implemented in the URL! handling specific to http://, and not a scanner rule that stops other URL!s from being accepted at LOAD-time.

  • I've suggested that inside the mechanics for FILE! and URL!, once an API has reached the level where it knows it is dealing with what must be a file or URL, TEXT! be used from that point on. The mechanical layer can then treat being TEXT! as an indicator that escaping has already happened, which avoids double-escaping (or unescaping). That is, all URL!s are assumed to be in a mixed form that permits escapes but tolerates unescaped characters, giving some wiggle room...and sequences that are already escaped won't be double-escaped during the translation to TEXT!.

(Note: The above mechanic is implemented in FILE-TO-LOCAL and LOCAL-TO-FILE. I don't know how many people have noticed...but it won't let you say LOCAL-TO-FILE on a FILE! by default.)
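
To make the "TEXT! means already escaped" idea concrete, here is a rough sketch in Python rather than Ren-C. It is not how FILE-TO-LOCAL or any Ren-C routine is implemented; `EscapedText` and `url_to_text` are hypothetical names invented for the illustration, and the choice of which characters to leave unescaped is an assumption. It passes existing %XX escapes through untouched, escapes everything else that needs it, and refuses input that is already marked as escaped:

```python
import re
from urllib.parse import quote


class EscapedText(str):
    """Hypothetical marker type standing in for TEXT!: escaping is already done."""


# The capture group makes re.split() keep existing %XX escapes as separate parts.
_ESCAPE = re.compile(r"(%[0-9A-Fa-f]{2})")

# Assumption: these delimiters stay literal; parentheses and spaces get escaped.
_LITERAL = ":/?#[]@!$&'*+,;=~"


def url_to_text(url: str) -> EscapedText:
    """Translate a tolerant, mixed-form URL into fully escaped text, exactly once."""
    if isinstance(url, EscapedText):
        raise TypeError("already escaped; refusing to escape twice")
    out = []
    for part in _ESCAPE.split(url):
        if _ESCAPE.fullmatch(part):
            out.append(part)  # existing escape: keep as-is, no double-escaping
        else:
            out.append(quote(part, safe=_LITERAL))  # escape what was left raw
    return EscapedText("".join(out))


text = url_to_text("https://example.com/?q=caf%C3%A9 (\u00e9)")
print(text)  # prints https://example.com/?q=caf%C3%A9%20%28%C3%A9%29
```

Calling `url_to_text(text)` a second time raises `TypeError`, which mirrors the way LOCAL-TO-FILE rejects a FILE! by default. Note that in this sketch a stray `%` with no hex digits after it is escaped to `%25` rather than rejected; whether that is desirable is one of the open design questions.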

Those are some of the parts. But we need a whole design that can be executed in the codebase. I don't even really have much feedback on whether the modification @rgchris requested, treating URLs "as-is", has been working out or not...but we have been running with it for a while.

We also need complete tests as a spec for what the code should do. If Red has tests for URL scanning then someone should dig those up as well, to examine for reasoning and differences.
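
As one illustration of what such tests might pin down, here is a sketch of the percent rule proposed in the list above, written in Python rather than Ren-C. The function names and the case table are assumptions made up for the example, not an existing test suite. The rule is applied only to http:// and https://, so other schemes stay free to give % their own meaning:

```python
import re

# One or more consecutive well-formed %XX escapes.
_PERCENT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def percents_are_valid_utf8(url: str) -> bool:
    """True if every % starts a %XX escape and each run of escapes is valid UTF-8."""
    # Any "%" left after removing well-formed escapes is malformed.
    if "%" in _PERCENT_RUN.sub("", url):
        return False
    for run in _PERCENT_RUN.findall(url):
        raw = bytes.fromhex(run.replace("%", ""))
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            return False
    return True


def check_url(url: str) -> bool:
    """Apply the percent rule to http(s) only; accept other schemes untouched."""
    scheme = url.split(":", 1)[0].lower()
    if scheme in ("http", "https"):
        return percents_are_valid_utf8(url)
    return True


# Hypothetical spec-style cases: (input, expected verdict).
CASES = [
    ("https://www.youtube.com/?test=%28snippet%3D1%29", True),
    ("http://example.com/caf%C3%A9", True),   # UTF-8 bytes for an accented e
    ("http://example.com/caf%E9", False),     # lone Latin-1 byte, not UTF-8
    ("http://example.com/100%", False),       # % with no hex digits after it
    ("foo://whatever/100%", True),            # not http, so the rule is skipped
]

for url, expected in CASES:
    assert check_url(url) == expected, url
```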

gchiu (Author) commented Feb 28, 2020 via email


5 participants