Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a laziness bug on Data.Array.Mutable.Linear #135

Merged
merged 4 commits into from
Aug 26, 2020

Conversation

utdemir
Copy link
Contributor

@utdemir utdemir commented Aug 20, 2020

This patch fixes an issue where using a previously read value from an array in a write causing an infinite loop.

Here is an example:

{-# LANGUAGE LinearTypes #-}

module Main where

import Data.Function ((&))
import qualified Prelude.Linear as L
import qualified Data.Array.Mutable.Linear as Array
import Prelude.Linear (Unrestricted(..))

processArr :: Array.Array () #-> Unrestricted ()
processArr arr =
  Array.read arr 0 L.& \(arr', Unrestricted before) ->
    Array.write arr' 0 (before `seq` ()) L.& \arr'' ->
      Array.read arr'' 0 L.& \(arr''', after) ->
          arr''' `L.lseq` after

main :: IO ()
main =
   Array.fromList [()] processArr
     & \(Unrestricted a) -> print a

The seq call there is just to explicitly show the dependency. I found this issue while writing a bigger application where I was simply incrementing an Int inside an Array.

I think what happens here is that the before thunk (which contains a Unsafe.readMutArr call inside), is not evaluated until the print statement. However, before it can do that, the relevant cell in the array is replaced with before `seq` () by the write. So, we're unknowingly tying the knot there.

This PR tries to fix the issue with evaluating the value before returning from the read. Other possible solutions I can think of is evaluating the value before write's instead, or doing the same on the lower level Data.MutableArray module.

However, I think there might be other similar problems hidden there; my brain hurts when I try to think about storing more complex unevaluated thunks inside Array's. Please let me know if there is a better solution.

@utdemir utdemir added this to the Initial release milestone Aug 23, 2020
@facundominguez
Copy link
Member

facundominguez commented Aug 24, 2020

Looks like a reasonable fix. A principle to follow in doing in-place updates, is ensuring that all side effects occur in the order in which the array references are created. It is an interesting piece to include in another blogpost about writing interfaces with linear types.

This could deserve a test like:

readAndWrite :: Char -> Array.Array Char #-> Unrestricted Char
readAndWrite c arr =
  Array.read arr 0 L.& \(arr', before) ->
    Array.write arr' 0 c L.& \arr'' ->
      arr'' `L.lseq` before

test_read_write :: Spec
test_read_write =
  it "does not reorder reads and writes" $
   Array.fromList "a" (readAndWrite 'b')
     & \(Unrestricted a) -> a `shouldBe` 'a'

Copy link
Member

@aspiwack aspiwack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts below.

I second @facundominguez 's request for an accompanying test. Subtle things like this can reappear.

Comment on lines 123 to 124
let val = Unsafe.readMutArr mutArr ix
in val `seq` (arr, Unrestricted val)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do:

Suggested change
let val = Unsafe.readMutArr mutArr ix
in val `seq` (arr, Unrestricted val)
let !val = Unsafe.readMutArr mutArr ix
in (arr, Unrestricted val)

I find it a bit clearer.

Interestingly, this is stricter than it needs to be. That is, if the arr stores undefined at ix, this version of read arr ix will loop, whereas it should be (arr, undefined). At least if our arrays are to have the same semantics as the mutable arrays of base.

Here is an idea, however: instead of storing a, let's store (# a #) instead. (either we do this directly in Unsafe.Array, or we define Array a to be an Unsafe.Array (# a #). I don't know. Then we would have

Suggested change
let val = Unsafe.readMutArr mutArr ix
in val `seq` (arr, Unrestricted val)
let (# val #) = Unsafe.readMutArr mutArr ix
in (arr, Unrestricted val)

As a bonus, we can't get this wrong: the types force us to force the value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, this array implementation looks awfully easy to get wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, this array implementation looks awfully easy to get wrong.

What is “this” in your sentence?

Copy link
Member

@facundominguez facundominguez Aug 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't imagined that making an array implementation that used linear types would be so tightly concerned with the evaluation order. Putting arrays and linear types together doesn't produce an obviously correct implementation. In retrospective, it is to be expected, maybe.

"This" didn't referred to your fix proposal, but rather to the whole endeavor of providing linear arrays.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, this is stricter than it needs to be.
Here is an idea, however: instead of storing a, let's store (# a #) instead.

I agree that this is stricter. But, in this case what we need to force is Unsafe.readMutArr mutArr ix call rather than ix itself, so actually I was thinking about the opposite, maybe we should store a Identity a inside instead. So then we could store bottoms inside. However it does introduce an overhead, so I'm not sure if we should.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Identity a? It has the exact same strictness as a. Maybe you meant a data version of Identity, to add an indirection. In which case, yes, there is a cost. That's why I suggested (# a #): exactly the same representation as a, but always forced to constructor form (# x #), without x being forced.

In fact, now that I think of it, we may not even need to store a (# a #), if Unsafe.readMutArr makes sure to return a (# a #) (one has to place the constructor in the right location though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought forcing an unboxed tuple would also force the value, if that's not the case, that'd be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unboxed tuple is always forced. But it can contain a thunk. This is, indeed, the point.

This patch fixes an issue where using a previously read value from an array in a write causing an infinite loop.

Here is an example:

```haskell
{-# LANGUAGE LinearTypes #-}

module Main where

import Data.Function ((&))
import qualified Prelude.Linear as L
import qualified Data.Array.Mutable.Linear as Array
import Prelude.Linear (Unrestricted(..))

processArr :: Array.Array () #-> Unrestricted ()
processArr arr =
  Array.read arr 0 L.& \(arr', Unrestricted before) ->
    Array.write arr' 0 (before `seq` ()) L.& \arr'' ->
      Array.read arr'' 0 L.& \(arr''', after) ->
          arr''' `L.lseq` after

main :: IO ()
main =
   Array.fromList [()] processArr
     & \(Unrestricted a) -> print a
```

I _think_ what happens here is that the `before` thunk (which contains a `Unsafe.readMutArr` call inside), is not evaluated until the `print` statement. However, before that, the relevant index on the array is replaced with `before \`seq\` ()` by the `write`. So, we're unknowingly tying the knot there.
@@ -64,6 +64,8 @@ group =
, testProperty "∀ a,i,x. len (write a i x) = len a" lenWrite
, testProperty "∀ a,s. len (resize s a) = s" lenResize
, testProperty "∀ a,s,x. len (resizeSeed s x a) = s" lenResizeSeed
-- unit tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: all the tests in this file are unit tests. I never remember a name for non-property based tests, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. What if I just say "others"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. Though, maybe, the point is that these are regression tests? So maybe “regression tests”?

@Divesh-Otwani
Copy link
Contributor

My main concern is that we need a way of reasoning about the overall correctness of the implementation. What this bug reveals is something that's been known for a while: the order of evaluation can change the result when evaluations have side effects impacting other terms. (And when things are pure, we don't need to worry, assuming termination of course.) I want to be able to (at least in words) spell out a correctness criterion for the order of evaluation in which nothing can fail and reason that this fixes that.

My intuition is that this PR does fix all issues. Specifically, my intuition is that all we care about is that any read or write operation is atomic in changing the array or extracting a value. It seems like this PR fixes that by avoiding a read thunk. I'll think about this a bit more and post my reasoning for why this does (or does not) ensure our implementation is correct.

Copy link
Member

@aspiwack aspiwack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge because it fixes a serious issue and replaces it with a more minor one.

However @utdemir do write an issue on making arrays respect laziness. Together with an example which doesn't work: write error "I don't want to see this" into an array cell, then read it back, then ignore it. The issue is fixed when the test passes and is added to the test suite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants