Match links with multiple hash (#) characters correctly #102
|
The changes LGTM. But as far as I remember, I implemented link parsing exactly as the RFC states. So I wonder whether this is a non-standard but common case, or whether I misimplemented this part. |
|
I just checked the RFC. I can't find out why https://matrix.to/#/#deltachat:matrix.org is a valid link. Care to give some hints? Edit: Or maybe it's a non standard but common thing? |
|
Found this W3C email thread from 2008 about this issue, which is still unresolved: https://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2008Nov/0044.html Not much help, but it seems like a harsh treatment (only one hash per fragment) would break many links that are currently in use in modern applications, like our test case here. |
|
So common but non-standard. It makes sense for us to support these links. But we have to write a spec of our own or use an existing one, and then implement our logic according to that spec. Do you know a widely accepted spec for such links? Or do you have more examples of them? We could also look at how other applications parse them. |
|
IMO we can just document in spec.md and in the source code that it "follows the following specs + also multiple hashtags to make matrix links work". No need to search for a specific spec that includes this. For now it is sufficient to document that we support it; we can always improve it further in the future. |
|
Found something interesting in RFC 3986, which also seems to define the fragment. Basically I read this as "it's up to the answering server". The last line, saying that it will not be redefined by later URI schemes, also suggests we're fine allowing basically anything in the fragment: https://www.rfc-editor.org/rfc/rfc3986#section-3.5 I'll update the spec to include a note about this, and also document the simple links without a protocol scheme that we've already merged from my other branch. |
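For reference while writing the spec note, here is a minimal sketch of the strict fragment grammar from RFC 3986 section 3.5 (`fragment = *( pchar / "/" / "?" )`). The function names are hypothetical and are not part of this crate. Under the strict grammar a literal `#` is not a fragment character, which is why multi-hash links need an explicit note in the spec rather than falling out of the RFC.

```rust
/// Returns true if `c` may appear literally in an RFC 3986 fragment,
/// not counting the '%' that starts a pct-encoded triplet.
fn is_strict_fragment_char(c: char) -> bool {
    c.is_ascii_alphanumeric()
        || "-._~".contains(c) // rest of "unreserved"
        || "!$&'()*+,;=".contains(c) // "sub-delims"
        || ":@/?".contains(c) // rest of "pchar", plus "/" and "?"
}

/// Hypothetical helper: checks the text after the first '#' against the
/// strict grammar, including pct-encoded triplets ("%" HEXDIG HEXDIG).
fn is_strict_fragment(fragment: &str) -> bool {
    let bytes = fragment.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // A '%' must be followed by exactly two hex digits.
            if i + 2 < bytes.len()
                && bytes[i + 1].is_ascii_hexdigit()
                && bytes[i + 2].is_ascii_hexdigit()
            {
                i += 3;
            } else {
                return false;
            }
        } else if bytes[i].is_ascii() && is_strict_fragment_char(bytes[i] as char) {
            i += 1;
        } else {
            // Covers '#', spaces, and any non-ASCII byte.
            return false;
        }
    }
    true
}
```

With this sketch, the fragment of the test link, `/#deltachat:matrix.org`, is rejected because of the second `#`, while something like `section-3.5` is accepted.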
…ple links without schema when they match allow-list TLDs
986f5ad to
f4285db
Simon-Laux
left a comment
code LGTM, didn't test
|
If you are not in a hurry, I will give my review before Monday, God willing. |
|
after this is merged we can make a new release and update it in desktop |
farooqkz
left a comment
LGTM
Should fix #101. Added the test case from the issue. This makes the parser greedier: everything after the first hash character is matched as part of the ifragment (see https://www.rfc-editor.org/rfc/rfc3987, page 7).
This is my first contribution in rust and for deltachat, happy to receive feedback and criticism to help me improve 😄
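The greedy behaviour described above can be illustrated with a small standalone sketch. This is not the parser's actual code: `split_fragment` is a hypothetical helper that only shows the idea of splitting at the first hash and keeping any later hashes inside the fragment.

```rust
/// Hypothetical helper, not this crate's API: splits a link at the first
/// '#' and treats everything after it, including further '#' characters,
/// as the fragment.
fn split_fragment(link: &str) -> (&str, Option<&str>) {
    match link.split_once('#') {
        Some((base, fragment)) => (base, Some(fragment)),
        None => (link, None),
    }
}
```

For the test case from the issue, `split_fragment("https://matrix.to/#/#deltachat:matrix.org")` yields the base `https://matrix.to/` and the fragment `/#deltachat:matrix.org`, so the whole link is kept together instead of being cut at the second hash.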