Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combining shrinking with resource usage is seemingly unsupported. #331

Open
NorfairKing opened this issue May 21, 2021 · 12 comments
Open

Combining shrinking with resource usage is seemingly unsupported. #331

NorfairKing opened this issue May 21, 2021 · 12 comments

Comments

@NorfairKing
Copy link

NorfairKing commented May 21, 2021

Ref hspec/hspec#531

It looks like quickcheck assumes that properties don't use any resources while they are being tested.

In particular reduceRose is called before starting shrinking:

MkRose res ts <- protectRose (reduceRose (unProp (unGen (unProperty f_or_cov) rnd1 size)))

This uses the assumption that the IO actions in IORose don't clean up resources that could affect the outcome of a test. (I think, please correct me if I'm wrong.)

Is this fixable? If not I might have to write my own Property tester to support this use-case.

@NorfairKing NorfairKing changed the title Combining shrinking with resource usage is broken. Combining shrinking with resource usage is seemingly unsupported. May 21, 2021
@phadej
Copy link
Contributor

phadej commented May 21, 2021

It's unclear to me what hspec does to construct & run ioProperty. If I run

module Main where

import Test.QuickCheck
import Control.Concurrent.STM

main :: IO ()
main = do
    var <- newTVarIO True
    let withTrue func = do
            atomically $ writeTVar var True
            r <- func
            atomically $ writeTVar var False
            pure r

    quickCheck $ forAllShrink (sized $ \ n -> pure n ) shrink $ \i -> ioProperty $ withTrue $ do
        b <- readTVarIO var
        if b
        then return (i < 20)
        else return (error "Invariant violated?") -- this should never happen, do I understand right?

I get

*** Failed! Falsified (after 21 tests):                  
20

which is expected failure, isn't it?

@NorfairKing
Copy link
Author

@NorfairKing
Copy link
Author

For the record: I just spent a few minutes trying to "just" get rid of that reduceRose but it turned out to be significantly difficult.

@nick8325
Copy link
Owner

nick8325 commented May 21, 2021

I'm not sure what you mean by "properties don't use any resources while they are being tested", but at least I can try to describe the situation.

The call to reduceRose executes only the IO that's needed to evaluate the property to a result (pass/fail). We for sure have to do this during testing, not only during shrinking, so I don't see any way to eliminate this call: if the property does IO, this is the line that actually executes the IO. If there's extra IO to be done during shrinking, that's supposed to appear as internal IORose nodes in the rose tree.

At the user level we have (as you noted) ioProperty and idempotentIOProperty. ioProperty disables shrinking, and idempotentIOProperty does not guarantee to re-execute the IO during shrinking (it can instead re-use the property that was returned when it last executed the IO). Note that ioProperty p only disables shrinking of any forAlls inside p, so as long as you can put all the quantifiers outside of the call to ioProperty it works fine.

What you would of course like is for ioProperty to always execute its IO action, and to shrink the returned property. But this is problematic, because in a call ioProperty p, p may do some IO, and then do quantification which is different depending on what the IO returned. Here is an example:

prop_vector :: Int -> Property
prop_vector n =
  forAllShrink (vector n) shrink $ \xs -> ...

prop_funny :: Property
prop_funny =
  ioProperty $ do
    n <- some nondeterminstic IO Int
    return (prop_vector n)

Here, prop_vector n produces a rose tree which follows the shrinking steps for a list of length n... but if n changes, the returned rose tree has a completely different shape! So if we re-execute the IO, we have to throw away any shrinking that we did on the vector. That's why ioProperty doesn't shrink its argument. I don't know of any way around this, unfortunately.

@nick8325
Copy link
Owner

Just to be more explicit, here QuickCheck is able to shrink the counterexample (note the "2 shrinks" below):

module Main where

import Test.QuickCheck
import Control.Concurrent.STM

main :: IO ()
main = do
    var <- newTVarIO True
    let withTrue func = do
            atomically $ writeTVar var True
            r <- func
            atomically $ writeTVar var False
            pure r

    quickCheck $ \i -> ioProperty $ withTrue $ do
        b <- readTVarIO var
        if b
        then return (i < 20)
        else return (error "Invariant violated?") -- this should never happen, do I understand right?

> main
*** Failed! Falsified (after 27 tests and 2 shrinks):    
20

@NorfairKing
Copy link
Author

NorfairKing commented May 21, 2021

I'm not sure what you mean by "properties don't use any resources while they are being tested", but at least I can try to describe the situation.

Thank you this has been really helpful already.

This withTrue here is a minimal example of a resource-level invariant that has to be maintained: during the test, the TVar must contain True. A more practical example would be one of a web server that needs to be running during a test, where instead of a boolean you would get a port number to call the web server on.
When a test fails in that use-case, the web server needs to be run again during shrinking, just like it would be during any other run of the test on a single example.
Right now when we do this, the test fails when rerun during shrinking because the web server is no longer running at that point.

So what I want to have happen is something like:

1. start server
2. Run test with example A , pass
3. stop server
4. start server
5. Run test with example B, fail
6. stop server
7. start server
8. Run with shrunk version of B
9. stop server 
...

(Here the "start server" and "stop server" would be supplied as a ((PortNumber -> IO r) -> IO r) function, not two separate functions.

When implementing a testing framework like hspec does, (or like I'm doing in sydtest, where I also encountered this problem.), the framework just gets a Property and must be able to run it with resources defined by the framework. This is where around and aroundAll come in.
In other words, I'm trying to implement this function:

runPropertyTestWithArg ::
  (outerArgs -> innerArg -> Property) ->
  TestRunSettings ->
  ((outerArgs -> innerArg -> IO r) -> IO r) ->
  IO TestRunResult

Here the first argument is the function that produces the property under test. outerArgs would be something like Http.Manager (set up once, around all tests) and innerArg would be something like a PortNumber (set up around every test).
It's important that the PortNumber is set up around every test, including every test with a shrunk version of a failing counterexample, otherwise the shrunk counterexample causes a different test failure than the original counterexample because the server is not running.

Note that ioProperty p only disables shrinking of any forAlls inside p, so as long as you can put all the quantifiers outside of the call to ioProperty it works fine.

Because we just get a Property on the framework level, I don't think this is possible.
Would there be a way to push the wrapper function (to set up the server) down into the forAlls?

EDIT: I would be happy to go into detail on a video call if that would help! I'm a very big user of QuickCheck and I care deeply about property testing so I hope I can help.

@NorfairKing
Copy link
Author

For the record, I tried rewriting my aroundProperty using idempotentIOProperty but I think then I run into the situation where the IO is not rerun during shrinking:

aroundProperty :: ((a -> b -> IO ()) -> IO ()) -> (a -> b -> Property) -> Property
aroundProperty action p =
  idempotentIOProperty $
    applySimpleWrapper2'
      action
      ( \a b -> do
          evaluate $ p a b
      )

shrinking_000

@nick8325
Copy link
Owner

I think then I run into the situation where the IO is not rerun during shrinking

Yes, I think so. The way idempotentIOProperty p works is: when a variable quantified outside p is shrunk, the IO gets rerun, but if the variable is quantified inside p, the result of the IO gets reused.

About the function you want to implement:

runPropertyTestWithArg ::
  (outerArgs -> innerArg -> Property) ->
  TestRunSettings ->
  ((outerArgs -> innerArg -> IO r) -> IO r) ->
  IO TestRunResult

Here, the problem is that your outerArgs -> innerArg -> Property function could do different quantifications depending on the value of innerArg, for example if innerArg is used to make decisions in a generator. If it does this, then I don't know how to do shrinking (and I also don't know how to detect at runtime if it does this).

One workaround could be to pass the property an IO innerArg rather than an innerArg. The user would have to be responsible for using ioProperty to extract the innerArg, and could put the call as deep as possible. But it's not ideal... I'll have to think about this some more.

@NorfairKing
Copy link
Author

NorfairKing commented May 21, 2021

@nick8325 I have just spent the last few hours completely nerd-sniped trying to convince myself that what I was trying was at all possible.

Here is what I've done. It is just a sketch but the output looks promising: https://gist.github.com/NorfairKing/ff8ede08df7ee809be01e57f4634f197

And indeed, we see this output:

Starting the test
30
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
Starting to shrink
Running the wrapper while shrinking
30
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
("Trying smaller version",0)
0
("result",True)
("Trying smaller version",15)
15
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
("result",False)
("Trying smaller version",0)
0
("result",True)
("Trying smaller version",8)
8
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
("result",False)
("Trying smaller version",0)
0
("result",True)
("Trying smaller version",4)
4
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
("result",False)
("Trying smaller version",0)
0
("result",True)
("Trying smaller version",2)
2
("result",True)
("Trying smaller version",3)
3
test failed
CallStack (from HasCallStack):
  error, called at /home/syd/src/sydtest/sydtest/src/Test/Syd/Scratchpad.hs:141:43 in main:Test.Syd.Scratchpad
("result",False)
("Trying smaller version",0)
0
("result",True)
("Trying smaller version",2)
2
("result",True)
("return the failure",3)
False

I really thought that I understood what you were saying, so now I no longer understand why this works. I had convinced myself that it wasn't possible.
You'll see in the code:

        -- HERE BE THE DRAGONS
        --
        -- What the hell do we do here?
        -- If we don't rerun the wrapper then the invariants of the wrappers are not maintained
        -- If we do rerun the wrapper then all of the shrinking below is thrown away.

Now I'm going to take a break, this was infuriating :p

@NorfairKing
Copy link
Author

Here, the problem is that your outerArgs -> innerArg -> Property function could do different quantifications depending on the value of innerArg, for example if innerArg is used to make decisions in a generator. If it does this, then I don't know how to do shrinking (and I also don't know how to detect at runtime if it does this).

Is there a way to make this work if I can get the user to pinkie-promise that the inner resource is not used to change the shape of the generator or shrinking function?

One workaround could be to pass the property an IO innerArg rather than an innerArg. The user would have to be responsible for using ioProperty to extract the innerArg, and could put the call as deep as possible. But it's not ideal... I'll have to think about this some more.

Yes I could actually just pass the entire ((outerArgs -> innerArg -> IO r) -> IO r) to the user and have them deal with it. That would be a nice workaround if it turns out what I'm trying isn't possible.

@NorfairKing
Copy link
Author

NorfairKing commented May 22, 2021

This been on my mind for a while now and I think I've been able to explain to myself why this seems to work.
The shrinking happens from the outside inwards, so by the time you would be "throwing away any shrinking below here", no shrinking has happened below there yet.

EDIT: I just figured out that what I made is actually still broken because the wrapper is not re-executed per run of a shrunken example's test. This means that a test could leak state into the next run and break the whole idea. I also need to think about this a bit more..

EDIT again: I think I may have figured something out, but it required quite the definition of how properties work.
see this for my working idea: https://github.com/NorfairKing/sydtest/blob/sydcheck/sydtest/src/Test/Syd/Scratchpad.hs
The big difference is that the idea of the resource has to be "baked in" somehow such that you can push the resource rapper to inside the quantifications. In the meantime I think the workaround idea that we discussed can be used in sydtest: No shrinking when resources are at play, shrinking can be activated by passing the entire ((outerArgs -> innerArg -> IO r) -> IO r) to the user.

@NorfairKing
Copy link
Author

Gentle ping.

In the meantime I've added this to sydtest to turn off shrinking when resources are in use: NorfairKing/sydtest@bbc7dcb#diff-f4f9ca3eba6b0fecf61df5940928227897178cad50db39356c016015382e29e8R178

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants