New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When anything in a URL would contain '@', please encode. #128

Closed
Nebukadneza opened this Issue Mar 16, 2014 · 19 comments

Comments

Projects
None yet
5 participants
@Nebukadneza

Nebukadneza commented Mar 16, 2014

Hello,

i was having a weired issue using davdroid and thunderbird with radicale. It seems that thunderbird/sogo/lighting for some reason appends '@' to the UID field. This leads to radicale sending DavDroid URLs with '@', causing DavDroid to crash.
I have opened a DavDroid issue describing the problem here:
https://github.com/rfc2822/davdroid/issues/198

There, the DavDroid developer told me to ask you, the radicale developers, for help to properly encode an '@' when encountered in the data ;).

Thanks for your great server, Best Regards
-Dario Ernst

@liZe liZe added Bug and removed Bug labels Jul 28, 2014

@liZe liZe added this to the 1.0 milestone Jul 28, 2014

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Aug 12, 2014

Contributor

This is probably the same as #158

Contributor

untitaker commented Aug 12, 2014

This is probably the same as #158

@dmfs

This comment has been minimized.

Show comment
Hide comment
@dmfs

dmfs Sep 1, 2014

Contributor

Please do not blindly encode '@'. There is no rule that says all '@'s need to be encoded in URLs. The specs explicitly allow '@' to appear in paths, see http://tools.ietf.org/html/rfc3986#section-3.3
There is no reason to encode the '@'. If DavDroid crashes upon unencoded '@'s they should fix it in their code.

Contributor

dmfs commented Sep 1, 2014

Please do not blindly encode '@'. There is no rule that says all '@'s need to be encoded in URLs. The specs explicitly allow '@' to appear in paths, see http://tools.ietf.org/html/rfc3986#section-3.3
There is no reason to encode the '@'. If DavDroid crashes upon unencoded '@'s they should fix it in their code.

@liZe

This comment has been minimized.

Show comment
Hide comment
@liZe

liZe Oct 2, 2014

Member

Duplicate of #158. Thanks a lot for the explaination @unittaker and @dmfs. As explained in RFC3986 (updating 1738), '@' is explicitely allowed in paths, there's no need to fix anything in Radicale.

Member

liZe commented Oct 2, 2014

Duplicate of #158. Thanks a lot for the explaination @unittaker and @dmfs. As explained in RFC3986 (updating 1738), '@' is explicitely allowed in paths, there's no need to fix anything in Radicale.

@liZe liZe closed this Oct 2, 2014

@dmfs

This comment has been minimized.

Show comment
Hide comment
@dmfs

dmfs Oct 2, 2014

Contributor

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

Contributor

dmfs commented Oct 2, 2014

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

@liZe liZe modified the milestones: 1.0, 0.10 Oct 22, 2014

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

If DavDroid crashes upon unencoded '@'s they should fix it in their code.

DAVdroid doesn't crash. It takes all @ characters in paths and encodes them which should always be OK.

Does Radicale handle URLs with %40 exactly the same as unencoded @? In this case, there should be no problem.

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

On which part of RFC 1738 is this based upon?

[Page 3] The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.

Further:

Thus, only alphanumerics, the special characters "$-.+!'(),", and reserved characters _used for their reserved purposes* may be used unencoded within a URL.

So, "@" must only by used for its reserved purpose which is to seperate user name/password and host name in the HTTP scheme.

Even if RFC 3986 allows unencoded "@", I think the way of always encoding "@" should always work, shouldn't it?

rfc2822 commented Dec 7, 2014

If DavDroid crashes upon unencoded '@'s they should fix it in their code.

DAVdroid doesn't crash. It takes all @ characters in paths and encodes them which should always be OK.

Does Radicale handle URLs with %40 exactly the same as unencoded @? In this case, there should be no problem.

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

On which part of RFC 1738 is this based upon?

[Page 3] The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.

Further:

Thus, only alphanumerics, the special characters "$-.+!'(),", and reserved characters _used for their reserved purposes* may be used unencoded within a URL.

So, "@" must only by used for its reserved purpose which is to seperate user name/password and host name in the HTTP scheme.

Even if RFC 3986 allows unencoded "@", I think the way of always encoding "@" should always work, shouldn't it?

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

Does Radicale handle URLs with %40 exactly the same as unencoded @? In this case, there should be no problem.

No, it always decodes to @, given either. This is wrong behavior, but given that @ doesn't have to be encoded (which is a different thing to discuss), it's not very problematic. @liZe maybe #158 should be re-opened because of this?

On which part of RFC 1738 is this based upon?

[Page 3] The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.

It's allowed in URIs: http://tools.ietf.org/html/rfc3986#section-3.3

Note that your quotes only talk about URL schemes (not the path section). In that RFC, the path component is defined as:

urlpath        = *xchar    ; depends on protocol see section 3.1
xchar          = unreserved | reserved | escape
reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
Contributor

untitaker commented Dec 7, 2014

Does Radicale handle URLs with %40 exactly the same as unencoded @? In this case, there should be no problem.

No, it always decodes to @, given either. This is wrong behavior, but given that @ doesn't have to be encoded (which is a different thing to discuss), it's not very problematic. @liZe maybe #158 should be re-opened because of this?

On which part of RFC 1738 is this based upon?

[Page 3] The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.

It's allowed in URIs: http://tools.ietf.org/html/rfc3986#section-3.3

Note that your quotes only talk about URL schemes (not the path section). In that RFC, the path component is defined as:

urlpath        = *xchar    ; depends on protocol see section 3.1
xchar          = unreserved | reserved | escape
reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
@dmfs

This comment has been minimized.

Show comment
Hide comment
@dmfs

dmfs Dec 7, 2014

Contributor

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

On which part of RFC 1738 is this based upon?

On section 5:

; HTTP

httpurl        = "http://" hostport [ "/" hpath [ "?" search ]]
hpath          = hsegment *[ "/" hsegment ]
hsegment       = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search         = *[ uchar | ";" | ":" | "@" | "&" | "=" ]

Even if RFC 3986 allows unencoded "@", I think the way of always encoding "@" should always work, shouldn't it?

With the same argument you could just encode pretty much all characters in the path sections, which would be valid as well.
Since @ is allowed to be unencoded in the path, never encoding it should always work, shouldn't it? ;-)

Contributor

dmfs commented Dec 7, 2014

FTR: even RFC 1738 allows unencoded '@' in the path, so that's not a new "feature" of RFC 3986

On which part of RFC 1738 is this based upon?

On section 5:

; HTTP

httpurl        = "http://" hostport [ "/" hpath [ "?" search ]]
hpath          = hsegment *[ "/" hsegment ]
hsegment       = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search         = *[ uchar | ";" | ":" | "@" | "&" | "=" ]

Even if RFC 3986 allows unencoded "@", I think the way of always encoding "@" should always work, shouldn't it?

With the same argument you could just encode pretty much all characters in the path sections, which would be valid as well.
Since @ is allowed to be unencoded in the path, never encoding it should always work, shouldn't it? ;-)

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

No, it always decodes to @, given either. This is wrong behavior, but given that @ doesn't have to be encoded (which is a different thing to discuss), it's not very problematic.

I'd say it's very problematic because the possiblity of encoding is essential part of URL behaviour. I have seen various DAV servers returning sometimes "@" and sometimes "%40" and have spent much time in finding a solution that works everywhere. Being liberal in receiving and strict in sending (i.e. always encoding @ as %40) seems to be a good and standards-compliant solution.

The issue is most problematical in combination with parsing a multi-response. When DAVdroid requests properties for /my%40path/, and the multi response returns info for /my@path/ and children, how is the client supposed to find whether /my%40path/ and /my@path/ are the same (which is required to know whether the properties are really for the requested resource)? This only works if encoding is handled correctly.

With the same argument you could just encode pretty much all characters in the path sections, which would be valid as well.

Exactly. It mustn't make any difference to a HTTP client/server whether all characters are encoded or not. Entering http://www.myserver.com/abc into a browser must be treated equally to entering http://www.myserver.com/%61%62%63.

Since @ is allowed to be unencoded in the path, never encoding it should always work, shouldn't it? ;-)

No, see above. Many servers have different solutions, workaround and constraints – the only thing that should always work is to encode "@".

rfc2822 commented Dec 7, 2014

No, it always decodes to @, given either. This is wrong behavior, but given that @ doesn't have to be encoded (which is a different thing to discuss), it's not very problematic.

I'd say it's very problematic because the possiblity of encoding is essential part of URL behaviour. I have seen various DAV servers returning sometimes "@" and sometimes "%40" and have spent much time in finding a solution that works everywhere. Being liberal in receiving and strict in sending (i.e. always encoding @ as %40) seems to be a good and standards-compliant solution.

The issue is most problematical in combination with parsing a multi-response. When DAVdroid requests properties for /my%40path/, and the multi response returns info for /my@path/ and children, how is the client supposed to find whether /my%40path/ and /my@path/ are the same (which is required to know whether the properties are really for the requested resource)? This only works if encoding is handled correctly.

With the same argument you could just encode pretty much all characters in the path sections, which would be valid as well.

Exactly. It mustn't make any difference to a HTTP client/server whether all characters are encoded or not. Entering http://www.myserver.com/abc into a browser must be treated equally to entering http://www.myserver.com/%61%62%63.

Since @ is allowed to be unencoded in the path, never encoding it should always work, shouldn't it? ;-)

No, see above. Many servers have different solutions, workaround and constraints – the only thing that should always work is to encode "@".

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

FWIW, I currently use this for "normalizing" hrefs, which will decode any @ chars. It's terrible, but apparently works:

def _normalize_href(base, href, decoding_rounds=1):
    '''Normalize the href to be a path only relative to hostname and
    schema.'''
    if not href:
        raise ValueError(href)
    x = utils.urlparse.urljoin(base, href)
    x = utils.urlparse.urlsplit(x).path

    for i in range(decoding_rounds):
        x = utils.compat.urlunquote(x)

    x = utils.compat.urlquote(x, '/@')
    return x

I set decoding_rounds=2 when doing PROPFIND requests because of owncloud/contacts#581

This decodes the path segment once (and twice for PROPFIND requests), and then encodes all unsafe characters except for /@.

Contributor

untitaker commented Dec 7, 2014

FWIW, I currently use this for "normalizing" hrefs, which will decode any @ chars. It's terrible, but apparently works:

def _normalize_href(base, href, decoding_rounds=1):
    '''Normalize the href to be a path only relative to hostname and
    schema.'''
    if not href:
        raise ValueError(href)
    x = utils.urlparse.urljoin(base, href)
    x = utils.urlparse.urlsplit(x).path

    for i in range(decoding_rounds):
        x = utils.compat.urlunquote(x)

    x = utils.compat.urlquote(x, '/@')
    return x

I set decoding_rounds=2 when doing PROPFIND requests because of owncloud/contacts#581

This decodes the path segment once (and twice for PROPFIND requests), and then encodes all unsafe characters except for /@.

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

No, it always decodes to @, given either.

Just to ask again … so when DAVdroid sends %40, will Radicale decode it to @? If it does so, I don't know where the problem is because DAVdroid always sends @ as %40, which is decoded by Radicale as you say…

rfc2822 commented Dec 7, 2014

No, it always decodes to @, given either.

Just to ask again … so when DAVdroid sends %40, will Radicale decode it to @? If it does so, I don't know where the problem is because DAVdroid always sends @ as %40, which is decoded by Radicale as you say…

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

@rfc2822 The problem is that if the client sends foo%40bar in a multiget request, Radicale will respond with foo@bar.

Contributor

untitaker commented Dec 7, 2014

@rfc2822 The problem is that if the client sends foo%40bar in a multiget request, Radicale will respond with foo@bar.

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

I also had similar problems with Zimbra, see pimutils/vdirsyncer#49

Contributor

untitaker commented Dec 7, 2014

I also had similar problems with Zimbra, see pimutils/vdirsyncer#49

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

@rfc2822 The problem is that if the client sends foo%40bar in a multiget request, Radicale will respond with foo@bar.

That should be no problem. DAVdroid always "sanitizes" all incoming URLs, making foo@bar to foo%40bar for exactly that reason. Maybe there is actually no problem and the original bug report was wrong. I'll check again now, but I have to install Radicale 0.9.1 first.

rfc2822 commented Dec 7, 2014

@rfc2822 The problem is that if the client sends foo%40bar in a multiget request, Radicale will respond with foo@bar.

That should be no problem. DAVdroid always "sanitizes" all incoming URLs, making foo@bar to foo%40bar for exactly that reason. Maybe there is actually no problem and the original bug report was wrong. I'll check again now, but I have to install Radicale 0.9.1 first.

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

DAVdroid always "sanitizes" all incoming URLs, making foo@bar to foo%40bar for exactly that reason

How does it detect if it has to encode? If the server actually responded with foo%40bar, would DAVDroid satinize this to foo%2540bar?

Contributor

untitaker commented Dec 7, 2014

DAVdroid always "sanitizes" all incoming URLs, making foo@bar to foo%40bar for exactly that reason

How does it detect if it has to encode? If the server actually responded with foo%40bar, would DAVDroid satinize this to foo%2540bar?

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

How does it detect if it has to encode? If the server actually responded with foo%40bar, would DAVDroid satinize this to foo%2540bar?

When the server sends foo%40bar, DAVdroid assumes that it sends a valid URL which means (decoded) foo@bar.
When the server sends foo%2540bar, DAVdroid assumes that it sends a valid URL which means (decoded) foo%40bar.

rfc2822 commented Dec 7, 2014

How does it detect if it has to encode? If the server actually responded with foo%40bar, would DAVDroid satinize this to foo%2540bar?

When the server sends foo%40bar, DAVdroid assumes that it sends a valid URL which means (decoded) foo@bar.
When the server sends foo%2540bar, DAVdroid assumes that it sends a valid URL which means (decoded) foo%40bar.

@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

That sounds straightforward, but from my experience this wouldn't pass my testsuite against both Radicale and ownCloud. I'll try again.

Contributor

untitaker commented Dec 7, 2014

That sounds straightforward, but from my experience this wouldn't pass my testsuite against both Radicale and ownCloud. I'll try again.

untitaker added a commit to pimutils/vdirsyncer that referenced this issue Dec 7, 2014

Rewrite DAV storages' encoding behavior
This is more explicit than the old behavior. See
Kozea/Radicale#128 for the discussion that led
to this.
@untitaker

This comment has been minimized.

Show comment
Hide comment
@untitaker

untitaker Dec 7, 2014

Contributor

I now wrote pimutils/vdirsyncer#147 which seems to work with stable Radicale and ownCloud. I still decode twice on PROPFIND requests though.

Contributor

untitaker commented Dec 7, 2014

I now wrote pimutils/vdirsyncer#147 which seems to work with stable Radicale and ownCloud. I still decode twice on PROPFIND requests though.

untitaker added a commit to pimutils/vdirsyncer that referenced this issue Dec 7, 2014

Rewrite DAV storages' encoding behavior
This is more explicit than the old behavior. See
Kozea/Radicale#128 for the discussion that led
to this.
@dmfs

This comment has been minimized.

Show comment
Hide comment
@dmfs

dmfs Dec 7, 2014

Contributor

Being liberal in receiving and strict in sending (i.e. always encoding @ as %40)

What makes you think that encoding @ is more "strict" than not encoding it? If anything, I'd say it's the opposite (looking at the BNF above).

how is the client supposed to find whether /my%40path/ and /my@path/ are the same

By doing URI normalization, i.e. decoding any character that doesn't have to be encoded. How do you compare /%6D%79%70%61%74%68/ and /mypath/?

the only thing that should always work is to encode "@".

What makes you think so? No, it's not. The specs clearly say that @ is allowed to be un-encoded in paths. So it should always work!

Contributor

dmfs commented Dec 7, 2014

Being liberal in receiving and strict in sending (i.e. always encoding @ as %40)

What makes you think that encoding @ is more "strict" than not encoding it? If anything, I'd say it's the opposite (looking at the BNF above).

how is the client supposed to find whether /my%40path/ and /my@path/ are the same

By doing URI normalization, i.e. decoding any character that doesn't have to be encoded. How do you compare /%6D%79%70%61%74%68/ and /mypath/?

the only thing that should always work is to encode "@".

What makes you think so? No, it's not. The specs clearly say that @ is allowed to be un-encoded in paths. So it should always work!

@rfc2822

This comment has been minimized.

Show comment
Hide comment
@rfc2822

rfc2822 Dec 7, 2014

By doing URI normalization, i.e. decoding any character that doesn't have to be encoded. How do you compare /%6D%79%70%61%74%68/ and /mypath/?

Exactly, that's how DAVdroid does it. The only difference is that it internally always stores '@' in its encoded form %40 and doesn't decode it before sending it to servers.

What makes you think so? No, it's not. The specs clearly say that @ is allowed to be un-encoded in paths. So it should always work!

I might think about always using '@' instead of '%40' but I'm sure it will break compatibility with many servers (believe me, I have spent hours and days with that issue).

What's wrong with sending %40, or /%6D%79%70%61%74%68/ instead of /mypath/ (from the DAVdroid side)?

rfc2822 commented Dec 7, 2014

By doing URI normalization, i.e. decoding any character that doesn't have to be encoded. How do you compare /%6D%79%70%61%74%68/ and /mypath/?

Exactly, that's how DAVdroid does it. The only difference is that it internally always stores '@' in its encoded form %40 and doesn't decode it before sending it to servers.

What makes you think so? No, it's not. The specs clearly say that @ is allowed to be un-encoded in paths. So it should always work!

I might think about always using '@' instead of '%40' but I'm sure it will break compatibility with many servers (believe me, I have spent hours and days with that issue).

What's wrong with sending %40, or /%6D%79%70%61%74%68/ instead of /mypath/ (from the DAVdroid side)?

untitaker added a commit to pimutils/vdirsyncer that referenced this issue Dec 7, 2014

Rewrite DAV storages' encoding behavior
This is more explicit than the old behavior. See
Kozea/Radicale#128 for the discussion that led
to this.

untitaker added a commit to pimutils/vdirsyncer that referenced this issue Dec 7, 2014

Rewrite DAV storages' encoding behavior
This is more explicit than the old behavior. See
Kozea/Radicale#128 for the discussion that led
to this.

childnode referenced this issue in owncloudarchive/contacts Aug 31, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment