Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add decode function. #12

Merged
merged 2 commits into from Aug 14, 2019

Conversation

@nsaunders
Copy link
Contributor

commented Aug 2, 2019

What does this pull request do?

The purpose of this pull request is to extend the library with a function that decodes form data into the FormURLEncoded structure.

Where should the reviewer start?

It's a pretty small change.

How should this be manually tested?

Here are my test cases:

> encode $ decode "a=aa&b=bb"
"a=aa&b=bb"

> encode $ decode "this=this%3Dthis"
"this=this%3Dthis"

> encode $ decode "&x=x&&y=y&z="
"&x=x&&y=y&z="

> decode "a=aa&b=bb"
(FormURLEncoded [(Tuple "a" (Just "aa")),(Tuple "b" (Just "bb"))])

> decode "this=this%3Dthis"
(FormURLEncoded [(Tuple "this" (Just "this=this"))])

> decode "&x=x&&y=y&z="
(FormURLEncoded [(Tuple "" Nothing),(Tuple "x" (Just "x")),(Tuple "" Nothing),(Tuple "y" (Just "y")),(Tuple "z" (Just ""))])

Other Notes:

I already updated the README, and I don't think any follow-up steps are necessary other than cutting a release.

@thomashoneyman thomashoneyman self-assigned this Aug 3, 2019

decodePart = String.split (Pattern "=") >>> case _ of
[k, v] -> Just $ Tuple (unsafeDecodeURIComponent k) (Just $ unsafeDecodeURIComponent v)
[k] -> Just $ Tuple (unsafeDecodeURIComponent k) Nothing
_ -> Nothing

This comment has been minimized.

Copy link
@thomashoneyman

thomashoneyman Aug 3, 2019

Member

I'm not particularly familiar with decodeURIComponent, but is it possible that a key or value could cause an exception? For example, the MDN documentation provides this pair of examples, the first of which ought to succeed and the second of which ought to fail:

decodeURIComponent('JavaScript_%D1%88%D0%B5%D0%BB%D0%BB%D1%8B');
// "JavaScript_шеллы"

try { 
  var a = decodeURIComponent('%E0%A4%A'); 
} catch(e) { 
  console.error(e); 
}

// URIError: malformed URI sequence

neither of which contains the explicit pattern = or &, and I'm not sure which part of it causes the exception thrown in the second case.

Do you mind clarifying how the decode function avoids exceptions, and handles the same cases that the JavaScript decodeURIComponent function does? (Feel free to correct me if this question doesn't seem pertinent).

This comment has been minimized.

Copy link
@thomashoneyman

thomashoneyman Aug 3, 2019

Member

Ideally the decode function in this library would:

  1. Successfully decode the same cases that the decodeURIComponent() function does
  2. Return Nothing for cases that would cause an exception in decodeURIComponent()

This comment has been minimized.

Copy link
@thomashoneyman

thomashoneyman Aug 12, 2019

Member

Now that there are safe versions in -globals, can we update this @nsaunders?

@nsaunders

This comment has been minimized.

Copy link
Contributor Author

commented Aug 3, 2019

Thanks for the response @thomashoneyman. I should have given more consideration to the possibility of decodeURIComponent throwing an error. It turns out that both encodeURIComponent and decodeURIComponent can throw errors although purescript-form-urlencoded's encode function doesn't currently have any notion of failure either. I also looked at purescript-uri's usage of unsafeDecodeURIComponent and usage of unsafeEncodeURIComponent and found no error handling there, either, so I am not entirely sure how to proceed.

I think the main issue is that the type of unsafeDecodeURIComponent and unsafeEncodeURIComponent, i.e. String -> String, doesn't suggest possible failure. So we are left to find a safe alternative, perhaps, to purescript-globals?

Another option would be to resort to the FFI as in purescript-httpure, but, actually, is this modeled correctly? I thought that throwing an error is considered a side effect. (As a relative PureScript newbie, I would love clarification one way or the other!) Assuming that throwing an error is a side effect, I think this would ultimately need to be reflected in the return types of encode and decode, or else we would need to resort to unsafePerformEffect perhaps...

Yet another possibility would be to attempt to validate the input before calling unsafeEncodeURIComponent/unsafeDecodeURIComponent, hopefully avoiding the possibility of a URIError altogether. I have found some examples attempting to do this sort of thing (here, here, and here), but I do wonder about the correctness/exhaustiveness of these solutions.

Please let me know what you think, or if you had another solution in mind. Maybe @garyb would be willing to opine as well, since I believe he has worked on purescript-uri and can see that he is keeping an eye on this issue. Otherwise, thanks again for the feedback!

@garyb

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

purescript-uri makes use of the unsafe functions "safely" 😉 - it only uses them in a context where the string has been parsed already - they're basically used just as lookup tables to do the actual decode conversion for characters, when we know they're already using a valid encoding. Even the encode stuff isn't used naively - none of the browser-provided encode functions entirely do the right thing for all parts of a URI, so they're only used to encode some characters - again just being used as a lookup really.

I agree that ideally there should be non-unsafe version of en/decodeURI[Component], I think the reason they don't exist in -globals is it's a zero-dependency library. If I had Maybe on hand, I'd implement them to return a value that models the exception that way - there's not any interesting information in the error or stack trace, so just a straight yay-or-nay result would work I think. Maybe it's time -globals got a dependency, I don't know.

And yeah, an exception/Error is most definitely a side effect, and should correspondingly happen in Effect. 🙂

@thomashoneyman

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

Thanks for the comprehensive response! @garyb made some good points; here are a few thoughts on my end:

It turns out that both encodeURIComponent and decodeURIComponent can throw errors although purescript-form-urlencoded's encode function doesn't currently have any notion of failure either.

This sounds like a problem to me. If encoding and decoding can fail then their types should reflect that -- likely through a Maybe return type.

I also looked at purescript-uri's usage of unsafeDecodeURIComponent and usage of unsafeEncodeURIComponent and found no error handling there.

@garyb noted that there isn't error handling because the author(s) felt confident that they were providing input which would always succeed. So that's a sensible use of the unsafe functions. But it doesn't mean other code should do the same thing.

I think the main issue is that the type of unsafeDecodeURIComponent and unsafeEncodeURIComponent, i.e. String -> String, doesn't suggest possible failure. So we are left to find a safe alternative, perhaps, to purescript-globals?

The use of unsafe* generally denotes "I might throw errors or otherwise do things not ordinarily allowed in PureScript given my type, so be careful," so I don't mind this pair of functions existing with their current types (and disclaimers that they can throw errors).

I think you've raised a good point -- if these are defined in the purescript-globals package, why not update that package's main namespace (Global) to have safe versions of the unsafe module (Globals.Unsafe)? That way we can have decodeURI :: String -> Maybe String and not have to worry about it.

@garyb I don't see a reason why -globals should avoid a dependency on -maybe; it's used basically everywhere else. Is there a reason it ought to be dependency-free?

Another option would be to resort to the FFI as in purescript-httpure, but, actually, is this modeled correctly? I thought that throwing an error is considered a side effect.

Generally that's true -- throwing an error is a side effect -- but a try/catch used in an FFI file is more akin to Maybe. Using it the way HTTPure does is like the function returning null on failure. That makes me comfortable representing the function as String -> Maybe String in PureScript. Perhaps it's not technically correct, but I'd still consider that FFI implementation a side-effect-free function.

@thomashoneyman

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

My vote would be:

  1. Make safe versions of encodeURIComponent and decodeURIComponent in purescript-globals
  2. Implement them the way HTTPure does, with the try/catch FFI implementation
  3. Import and use them in this library
nsaunders added 2 commits Aug 12, 2019

@nsaunders nsaunders force-pushed the nsaunders:decode branch from e6fe215 to fda176b Aug 12, 2019

@nsaunders nsaunders changed the title Add decode function. [Merge after #13] Add decode function. Aug 12, 2019

@nsaunders

This comment has been minimized.

Copy link
Contributor Author

commented Aug 12, 2019

Updated test cases:

> decode "a=aa&b=bb"
(Just (FormURLEncoded [(Tuple "a" (Just "aa")),(Tuple "b" (Just "bb"))]))

> decode "this=this%3Dthis"
(Just (FormURLEncoded [(Tuple "this" (Just "this=this"))]))

> decode "&x=x&&y=y&z="
(Just (FormURLEncoded [(Tuple "" Nothing),(Tuple "x" (Just "x")),(Tuple "" Nothing),(Tuple "y" (Just "y")),(Tuple "z" (Just ""))]))

> decode "a=b&%8A=c"
Nothing

> encode =<< decode "a=aa&b=bb"
(Just "a=aa&b=bb")

> encode =<< decode "this=this%3Dthis"
(Just "this=this%3Dthis")

> encode =<< decode "&x=x&&y=y&z="
(Just "&x=x&&y=y&z=")

> encode =<< decode "a=b&%8A=c"
Nothing
@thomashoneyman

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

@garyb this is going to be a breaking change because of the type signature difference. Do you have an opinion on adding unsafeEncode and unsafeDecode functions for folks relying on this to use in the next version? Or should they just use unsafePartial to recover that?

EDIT: Whoops, meant to write this on #13.

@thomashoneyman thomashoneyman changed the title [Merge after #13] Add decode function. Add decode function. Aug 14, 2019

@thomashoneyman

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

Updated test cases:

@nsaunders actually, do you mind adding a test directory with those tests you've written? It would look something like:

https://github.com/purescript-contrib/purescript-js-date/blob/master/test/Test/Main.purs
https://github.com/thomashoneyman/purescript-slug/blob/master/test/Main.purs

where each of your lines above would be an assert statement. I can then set up CI which will run the tests automatically for future branches.

@garyb

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

@thomashoneyman yeah, I'd release it with a new breaking version. The old one is "bad" since it can explode, so it'd be better if people updated to the new one and dealt with the fact that encoding can fail 🙂

👍 for adding some tests that run in CI

@thomashoneyman thomashoneyman merged commit 1d7b4ee into purescript-contrib:master Aug 14, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@thomashoneyman

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

I've added an issue, #14, for the test cases. I've also updated master to include a test directory and for CI to run those tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.