Skip to content

Conversation

@jackaperkins
Copy link
Contributor

Should fix #101 Added test case from issue, this makes the parser more greedy in matching everything after first hash character as being part of the ifragment (see https://www.rfc-editor.org/rfc/rfc3987 page 7).

This is my first contribution in rust and for deltachat, happy to receive feedback and criticism to help me improve 😄

@Simon-Laux Simon-Laux requested a review from farooqkz May 17, 2025 17:42
@farooqkz
Copy link
Collaborator

The changes LGTM. But as far as I remember, I implemented link parsing exactly like the RFC stated. So I wonder if it's a non standard but common case or if I misimplemented this part.

@farooqkz
Copy link
Collaborator

farooqkz commented May 19, 2025

I just checked the RFC. I can't find out why https://matrix.to/#/#deltachat:matrix.org is a valid link. Care to give some hints?

Edit: Or maybe it's a non standard but common thing?

@jackaperkins
Copy link
Contributor Author

Found this w3c email thread from 2008 that is unresolved regarding this issue https://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2008Nov/0044.html Not much help, but it seems like a harsh treatment (only one hash per fragment) would break many links that curently in use in modern applications, like our test case here.

@farooqkz
Copy link
Collaborator

So common but non standard. It makes sense we support these links. But we have to make a spec of our own or use an existing one. And then implement our logic according to that spec. Do you know a widely accepted spec for such these links? Or do you have more examples of such these?

We could also see how other applications parse it.

@Simon-Laux
Copy link
Member

Simon-Laux commented May 20, 2025

IMO we can just document it in spec.md and source code that it "follows the following specs + also multiple hashtags to make matrix links work"

no need to search for a specific spec that includes this - for now it is sufficient if we just document that we support it, we can always improve it further in the future.

@jackaperkins
Copy link
Contributor Author

Found something interesting in this RFC3986 which seems to also define that the fragment. Basically I read this as "it's up to the answering server". The last line saying that it will not be redefined by later URI scheme also suggests we're fine allow basically anything into the fragment

https://www.rfc-editor.org/rfc/rfc3986#section-3.5

The semantics of a fragment identifier are defined by the set of
   representations that might result from a retrieval action on the
   primary resource.  The fragment's format and resolution is therefore
   dependent on the media type [[RFC2046](https://www.rfc-editor.org/rfc/rfc2046)] of a potentially retrieved
   representation, even though such a retrieval is only performed if the
   URI is dereferenced.  If no such representation exists, then the
   semantics of the fragment are considered unknown and are effectively
   unconstrained.  Fragment identifier semantics are independent of the
   URI scheme and thus cannot be redefined by scheme specifications.

I'll update the spec to include a note about this, as well as documenting the simple link without protocol scheme that we've already merged from my other branch.

Copy link
Member

@Simon-Laux Simon-Laux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code LGTM, didn't test

@farooqkz
Copy link
Collaborator

If you are not in a hurry, I will give my review before Monday, God willing.

@Simon-Laux
Copy link
Member

after this is merged we can make a new release and update it in desktop

Copy link
Collaborator

@farooqkz farooqkz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Simon-Laux Simon-Laux merged commit d81f932 into deltachat:main May 26, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Links containing colons (ie: matrix.to) are not fully read

3 participants