http-request when URI is a puri:uri -- more silent coercion problems #13

mon-key · 2012-03-20T22:43:52Z

In light of your recent fix, I've now noticed that when the URI arg to http-request is a puri:uri we have a situation similar to the one when URI is a string: the following two calls do not return equivalent results:

(http-request #u"http://id.loc.gov/vocabulary/graphicMaterials/label/Action%20%26%20adventure%20dramas"
              :preserve-uri t :method :head)

(drakma:http-request "http://id.loc.gov/vocabulary/graphicMaterials/label/Action%20%26%20adventure%20dramas"
                     :preserve-uri t :method :head)

It is beyond me whether this difference in return values constitutes a bug or not.

This said, I would like to point out that if it is considered a bug, then the possible fixes around puri may not be as trivial as they were for strings, esp. b/c puri:parse-uri breaks percent-encoded non-ASCII characters by silently coercing them to goo.

This returns:

(drakma:http-request "http://id.loc.gov/vocabulary/graphicMaterials/label/A%20la%20poup%C3%A9e%20prints"
                     :preserve-uri t 
                     :method :head)

These don't:

(drakma:http-request #u"http://id.loc.gov/vocabulary/graphicMaterials/label/A%20la%20poup%C3%A9e%20prints"
                     :preserve-uri t 
                     :method :head)

(drakma:http-request
 (puri:parse-uri "http://id.loc.gov/vocabulary/graphicMaterials/label/A%20la%20poup%C3%A9e%20prints")
 :preserve-uri t 
 :method :head)

additional discussion here:
http://paste.lisp.org/+2R44

Also, maybe these are relevant:
https://github.com/archimag/puri-unicode
https://github.com/franzinc/uri

tmccombs · 2015-03-26T05:46:02Z

puri isn't converting the percent encodings into goo; it is decoding each percent-encoded octet as a Latin-1 character. In your example, %C3 is Ã and %A9 is © in Latin-1. But the two octets that read as "Ã©" in Latin-1 are the same octets that encode é in UTF-8.
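To make the Latin-1 versus UTF-8 point concrete, here is a minimal sketch in plain Common Lisp. The helper names (percent-decode-to-octets, octets-to-latin-1, octets-to-utf-8) are hypothetical and are not part of puri or Drakma. It assumes an implementation where code-char maps integers to Unicode code points (as SBCL and CCL do), and the UTF-8 decoder does no validation.

```lisp
;; Hypothetical helpers for illustration only; not puri or Drakma API.

(defun percent-decode-to-octets (string)
  "Return a list of octets: each %XX escape becomes one octet,
every other character contributes its char-code.
Malformed escapes signal an error from PARSE-INTEGER."
  (loop with i = 0
        while (< i (length string))
        collect (let ((ch (char string i)))
                  (cond ((and (char= ch #\%) (<= (+ i 3) (length string)))
                         (prog1 (parse-integer string :start (1+ i)
                                                      :end (+ i 3)
                                                      :radix 16)
                           (incf i 3)))
                        (t (incf i)
                           (char-code ch))))))

(defun octets-to-latin-1 (octets)
  "Interpret every octet as one Latin-1 character."
  (map 'string #'code-char octets))

(defun octets-to-utf-8 (octets)
  "Minimal UTF-8 decoder. Assumes well-formed input; no validation."
  (let ((out (make-string-output-stream)))
    (loop while octets
          do (let* ((b (pop octets))
                    ;; Number of continuation octets that follow the lead octet.
                    (extra (cond ((< b #x80) 0)
                                 ((< b #xE0) 1)
                                 ((< b #xF0) 2)
                                 (t 3)))
                    (code (case extra
                            (0 b)
                            (1 (logand b #x1F))
                            (2 (logand b #x0F))
                            (t (logand b #x07)))))
               (loop repeat extra
                     do (setf code (logior (ash code 6)
                                           (logand (pop octets) #x3F))))
               (write-char (code-char code) out)))
    (get-output-stream-string out)))

;; The same two octets, read two ways:
(let ((octets (percent-decode-to-octets "poup%C3%A9e")))
  (list (octets-to-latin-1 octets)    ; "poupÃ©e"
        (octets-to-utf-8 octets)))    ; "poupée"
```

The octets never change; only the interpretation does, which is why the mangled string is recoverable as long as nothing re-encodes it in between.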

However, the RFC for URLs (1738) says:

Octets must be encoded if they have no corresponding graphic
character within the US-ASCII coded character set, if the use of the
corresponding character is unsafe, or if the corresponding character
is reserved for some other interpretation within the particular URL
scheme.

So puri should not be decoding the percent encodings into non-ASCII characters.
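As a rough sketch of what that rule means in practice, the following plain Common Lisp encodes a path segment so that no non-ASCII octet is left unescaped. The function names are hypothetical and not taken from puri or Drakma. Two assumptions are swapped in: the text is encoded as UTF-8 before escaping, and the set of characters left unescaped is the "unreserved" set from RFC 3986 rather than the safe set listed in RFC 1738.

```lisp
;; Hypothetical helpers for illustration only; not puri or Drakma API.

(defun char-to-utf-8-octets (char)
  "Return the UTF-8 octets for CHAR as a list."
  (let ((code (char-code char)))
    (cond ((< code #x80) (list code))
          ((< code #x800)
           (list (logior #xC0 (ash code -6))
                 (logior #x80 (logand code #x3F))))
          ((< code #x10000)
           (list (logior #xE0 (ash code -12))
                 (logior #x80 (logand (ash code -6) #x3F))
                 (logior #x80 (logand code #x3F))))
          (t
           (list (logior #xF0 (ash code -18))
                 (logior #x80 (logand (ash code -12) #x3F))
                 (logior #x80 (logand (ash code -6) #x3F))
                 (logior #x80 (logand code #x3F)))))))

(defun unreserved-octet-p (octet)
  "True for ASCII letters, digits and - . _ ~ (assumed RFC 3986 unreserved set)."
  (and (< octet #x80)
       (let ((ch (code-char octet)))
         (or (alphanumericp ch)
             (find ch "-._~")))))

(defun percent-encode-segment (string)
  "Percent-encode every octet of STRING's UTF-8 form that is not unreserved."
  (with-output-to-string (out)
    (loop for ch across string
          do (dolist (octet (char-to-utf-8-octets ch))
               (if (unreserved-octet-p octet)
                   (write-char (code-char octet) out)
                   ;; Two uppercase hex digits, zero padded.
                   (format out "%~:@(~2,'0X~)" octet))))))

(percent-encode-segment "A la poupée prints")
;; => "A%20la%20poup%C3%A9e%20prints"
```

The result contains only ASCII, so a URI object that stores the path in this form can be handed back to the server unchanged.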
