This repository has been archived by the owner on Oct 9, 2019. It is now read-only.

Non-printable characters aren't escaped #217

Open

rhofour opened this issue Dec 2, 2017 · 10 comments

rhofour commented Dec 2, 2017

Right now it's impossible to tell the difference between String.fromList [ '\1', '\2' ] and an empty string when it's used as the input to a fuzz test. This caused some confusion when I thought the input was an empty string.

I'm not sure what the right solution is here, but as-is this is a bit of a problem.

Perhaps we could check whether a string contains unprintable characters and, if so, display it as String.fromList [...] instead of in the usual way.

Alternatively, we could add an option to change how the "Given:" portion of fuzz tests is displayed. That way I could print my data in the way that's most helpful for me, but we could keep the default unchanged.
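
For context, here's a minimal sketch of the kind of test where this bites (illustration only, in Elm 0.18-era elm-test syntax; binaryString and myDecode are made-up names, and myDecode just stands in for whatever function is actually under test):

import Char
import Expect
import Fuzz exposing (Fuzzer)
import Test exposing (Test, fuzz)

-- Fuzzer over "binary" strings built from character codes 0-127.
binaryString : Fuzzer String
binaryString =
    Fuzz.list (Fuzz.map Char.fromCode (Fuzz.intRange 0 127))
        |> Fuzz.map String.fromList

-- Placeholder for the real function under test; it rejects control
-- characters, which reproduces the confusing report described above.
myDecode : String -> Bool
myDecode input =
    String.all (\c -> Char.toCode c >= 32) input

suite : Test
suite =
    fuzz binaryString "myDecode accepts every binary string" <|
        \input ->
            -- When this fails for input == String.fromList [ '\1', '\2' ],
            -- the "Given:" line looks exactly like an empty string.
            Expect.true "expected myDecode to accept the input" (myDecode input)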

mgold (Member) commented Dec 2, 2017

Is this an issue with elm-test itself or the node runner?

drathier (Collaborator) commented Dec 3, 2017

If this is a string generated by Fuzz.string, we're waiting for a patch release of elm-core so we can release #204, which should fix this (assuming you have Unicode support in your terminal).

rhofour (Author) commented Dec 3, 2017

This isn't from a string generated by Fuzz.string. I'm fuzzing binary strings containing characters 0 to 127 and running into this issue because a lot of them aren't printable (and none are Unicode characters).

I don't know if the problem is in elm-test or the node runner.

drathier (Collaborator) commented Dec 3, 2017

Okay, two options then. Either elm-test decides what's printable, and outputs hex digits or similar if there's an unprintable character in a string, or the runners do the same thing. The main issue is figuring out what's considered printable and what isn't. Maybe we can just show it as "" (hex: [0x01, 0x02]), rather than "", if there's a single unprintable character in there?
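
For the ASCII case, the check could look something like this sketch (illustration only; isPrintable, toHexByte, and showString are made-up names, not functions in elm-test or the runners):

import Char

-- Treat the ASCII range 0x20-0x7E as printable; anything else triggers the hex fallback.
isPrintable : Char -> Bool
isPrintable c =
    let
        code =
            Char.toCode c
    in
        code >= 0x20 && code <= 0x7E

-- Render a character code as two hex digits with a 0x prefix, e.g. 1 -> "0x01".
toHexByte : Int -> String
toHexByte code =
    let
        digit n =
            String.fromChar <|
                if n < 10 then
                    Char.fromCode (Char.toCode '0' + n)
                else
                    Char.fromCode (Char.toCode 'a' + n - 10)
    in
        "0x" ++ digit (code // 16) ++ digit (rem code 16)

-- Fall back to toString for fully printable strings; otherwise append the hex codes.
showString : String -> String
showString s =
    if String.all isPrintable s then
        toString s
    else
        toString s
            ++ " (hex: ["
            ++ String.join ", " (List.map (Char.toCode >> toHexByte) (String.toList s))
            ++ "])"

How to extend the isPrintable check beyond ASCII is the open question.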

rhofour (Author) commented Dec 3, 2017

Your suggestion for how to display it sounds good to me. For ASCII, detecting whether a character is printable should be pretty easy, but I don't know how to extend that to Unicode.

Like I mentioned above, I'd also be happy with a user-settable printing function. That way I can always have it print the hex values for my unusual use case. I imagine at least some other people also have cases where the default printing isn't ideal.

rhofour (Author) commented Dec 24, 2017

It looks like the expected and actual parts are formatted in Expect.testWith:

testWith : (String -> String -> Reason) -> String -> (a -> b -> Bool) -> b -> a -> Expectation
testWith makeReason label runTest expected actual =
    if runTest actual expected then
        pass
    else
        { description = label
        , reason = makeReason (toString expected) (toString actual)
        }
            |> Test.Expectation.fail

I'd like to propose attaching formatters to Expectations. These would default to toString for simplicity and backwards compatibility, but could be changed in less common cases like mine.
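
As a rough sketch of what I mean (hypothetical; testWithFormat and its formatter arguments are not an existing elm-test API, this just adapts testWith from above):

testWithFormat : (String -> String -> Reason) -> String -> (b -> String) -> (a -> String) -> (a -> b -> Bool) -> b -> a -> Expectation
testWithFormat makeReason label showExpected showActual runTest expected actual =
    if runTest actual expected then
        pass
    else
        -- the caller-supplied formatters replace the hard-coded toString calls
        { description = label
        , reason = makeReason (showExpected expected) (showActual actual)
        }
            |> Test.Expectation.fail

Existing callers would just pass toString for both formatters and behave exactly as they do today.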

Could I get something like this merged in?

rtfeldman (Member) commented:

> I'd like to propose attaching formatters to Expectations. These would default to toString for simplicity and backwards compatibility, but could be changed in less common cases like mine.

I'd much prefer @drathier's proposed solution:

> elm-test decides what's printable, and outputs hex digits or similar if there's an unprintable character in a string

@rhofour if you're interested in working on this approach, I think a good next step would be to open another issue proposing how this would work!

rhofour (Author) commented Dec 24, 2017

I think in that case the right place to implement it might be Elm's toString itself. Otherwise we'll end up pretty dependent on the details of how toString works, and likely with something that doesn't look as nice.

For example, consider something like this:

type Foo = Foo String String
x = Foo "test" "a\0b\1c\2d"

If you run toString on x you get:

Foo "test" "a\0bcd"

Here the bytes 0x01 and 0x02 are printed, but they don't display because they're non-printable.

If we follow @drathier's suggestion we would have to go back and undo Elm's escaping, then convert everything to something like hex digits. Afterwards we'd end up with something like this:

<hexstring 0x466f6f2022746573742220226100620163026422>

We could do better if we could write our own version of toString that only changes how strings are escaped and returns something like this:

Foo "test" <hexstring 0x61006201630264>

However, this isn't possible in Elm, because we can't write a function that's polymorphic over all types. Instead, we would have to modify toString itself to behave like this, and I have no idea whether something like that would be accepted upstream.

We can, however, accomplish something similar by having the user write a formatter specifically for their type. That's why I'm suggesting this approach.
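
For my Foo example, such a user-written formatter might look roughly like this (hypothetical code, reusing the <hexstring ...> notation from above; charToHex, showBinary, and showFoo are made-up names):

import Char

-- Same Foo as above.
type Foo
    = Foo String String

-- Render one character as two lowercase hex digits, e.g. 'a' -> "61".
charToHex : Char -> String
charToHex c =
    let
        digit n =
            String.fromChar <|
                if n < 10 then
                    Char.fromCode (Char.toCode '0' + n)
                else
                    Char.fromCode (Char.toCode 'a' + n - 10)

        code =
            Char.toCode c
    in
        digit (code // 16) ++ digit (rem code 16)

-- Hex-encode an entire (possibly binary) string.
showBinary : String -> String
showBinary s =
    "<hexstring 0x" ++ String.concat (List.map charToHex (String.toList s)) ++ ">"

-- User-written formatter for Foo: the first field is known to be printable,
-- so it keeps the normal quoting; the second may contain arbitrary bytes.
showFoo : Foo -> String
showFoo (Foo name payload) =
    "Foo " ++ toString name ++ " " ++ showBinary payload

Applied to the x above, showFoo produces exactly the Foo "test" <hexstring 0x61006201630264> output I described.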

rtfeldman (Member) commented Dec 24, 2017

@rhofour - sorry, would you mind opening a new issue (in this repo) to discuss the implementation? I don't want that thread to get mixed with this one. 😅

rhofour (Author) commented Dec 24, 2017

I opened #224 specifically for adding custom formatters.
