@Divesh-Otwani
Contributor

This PR fleshes out the design laid out in the better-push-array branch. [In an effort to avoid plagiarism: note that all of these are @aspiwack's ideas.] We represent push arrays with this clever little rank-2 thing:

data Array a where
  Array :: (forall m. Monoid m => (a -> m) -> m) %1-> Array a

This takes a function that converts an element into an arbitrary monoid and produces a value of that monoid. It therefore represents the monoidal concatenation of some natural number (possibly zero) of values of type a, in some unknown order.
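To make the rank-2 encoding concrete, here is a minimal sketch in plain (non-linear) Haskell. It is a simplification, not the PR's code: the linear arrow is dropped, and the names PushArray, example, pushToList and size are hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Monoid (Sum (..))

-- Simplified, non-linear stand-in for the PR's Array type (assumption:
-- the real constructor takes its argument linearly).
newtype PushArray a = PushArray (forall m. Monoid m => (a -> m) -> m)

-- Three elements, concatenated in whichever monoid the consumer picks.
example :: PushArray Int
example = PushArray (\k -> k 1 <> k 2 <> k 3)

-- Consumer 1: choose the list monoid to recover the elements.
pushToList :: PushArray a -> [a]
pushToList (PushArray f) = f (\x -> [x])

-- Consumer 2: choose the Sum monoid to count the elements.
size :: PushArray a -> Int
size (PushArray f) = getSum (f (\_ -> Sum 1))
```

The producer never commits to a container; each consumer decides what "concatenation" means by picking the monoid.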

We instantiate this with a function that takes a value of type a and turns it into an ArrayWriter a:

data ArrayWriter a where
  ArrayWriter :: (DArray a %1-> ()) %1-> !Int -> ArrayWriter a

which holds the ingredients needed to write some number of elements, without holding the space to do so.
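As a rough illustration of "ingredients without the space", the following toy model replaces the destination array with a list of (index, value) writes. This is an assumption-laden sketch: ToyWriter, singletonWriter and runToyWriter are hypothetical names, and the real ArrayWriter works on a linear DArray rather than on offsets.

```haskell
-- Toy model: a writer knows how many elements it will write and, given a
-- start offset, which (index, value) writes it would perform. It holds no
-- storage of its own.
data ToyWriter a = ToyWriter (Int -> [(Int, a)]) Int

instance Semigroup (ToyWriter a) where
  -- The second writer starts where the first one ends.
  ToyWriter w1 n1 <> ToyWriter w2 n2 =
    ToyWriter (\off -> w1 off ++ w2 (off + n1)) (n1 + n2)

instance Monoid (ToyWriter a) where
  -- Writes nothing and has length zero.
  mempty = ToyWriter (const []) 0

-- A writer for exactly one element.
singletonWriter :: a -> ToyWriter a
singletonWriter x = ToyWriter (\off -> [(off, x)]) 1

-- Run the writer from offset 0, returning the total length and the writes.
runToyWriter :: ToyWriter a -> (Int, [(Int, a)])
runToyWriter (ToyWriter w n) = (n, w 0)
```

Because the length is known before any write happens, a consumer can allocate the destination once and then hand it to the writer.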

This is part 1/2 because I still have to add the fold functions.

transfer (Pull.Array f n) = Push.Array (\g -> DArray.fromFunction (\i -> g (f i))) n
transfer (Pull.Array f n) =
  Push.Array
    (\k -> Prelude.foldl (\m i -> m <> (k (f i))) mempty [0..(n-1)])
Contributor

I'm not sure, but in my experience foldl is rarely the most efficient fold. Probably I'd go with foldl'.

Member

We discussed, at some point somewhere, making foldl be the strict version, and having another name for the lazy fold. Probably we should just go for it.

Member

In this case, though, this ought to be a simpler foldMap. Do we have foldMap?

Contributor Author

@Divesh-Otwani Jan 5, 2021

We can't use a linear foldMap without changing the type of Push and Pull to hold Int %1-> a and (a %1-> m) -> m respectively. I will use Prelude.foldMap.

Contributor Author

Or did you mean a foldMap on Pull arrays? @aspiwack

Member

I meant foldMap on lists. But you are right that there is some tension in the types. Let's forget about this for now.
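For reference, the two alternatives raised in this thread (a strict left fold and foldMap on lists) can be compared in a small non-linear sketch. The Pull and Push types here are simplified stand-ins for the library's, and transferFoldl', transferFoldMap and pushToList are hypothetical names.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.List (foldl')

-- Simplified, non-linear stand-ins for the pull and push array types.
data Pull a = Pull (Int -> a) Int
newtype Push a = Push (forall m. Monoid m => (a -> m) -> m)

-- Strict left fold over the index range, as suggested in the first reply.
transferFoldl' :: Pull a -> Push a
transferFoldl' (Pull f n) =
  Push (\k -> foldl' (\m i -> m <> k (f i)) mempty [0 .. n - 1])

-- The same traversal written as a foldMap on the list of indices.
transferFoldMap :: Pull a -> Push a
transferFoldMap (Pull f n) =
  Push (\k -> foldMap (\i -> k (f i)) [0 .. n - 1])

-- Observe a push array by choosing the list monoid.
pushToList :: Push a -> [a]
pushToList (Push g) = g (\x -> [x])
```

Both versions visit the indices in the same order; the foldMap form simply states the intent (map each index into the monoid, then combine) more directly.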

-- must follow so that we can release a safe API.

emptyWriter :: ArrayWriter a
emptyWriter = ArrayWriter (Unsafe.toLinear (const ())) 0
Contributor

The Unsafe call here assumes that we can just forget about the DArray. I think this is the case, but it might be worth a comment here. (I think every unsafe call deserves a comment unless incredibly obvious.)

Member

There shouldn't be an Unsafe.toLinear in this module. If we are missing a way to consume an empty destination array, then let's add it to the destination array module.

Contributor Author

@Divesh-Otwani Jan 5, 2021

I'll make a safe empty DArray consumer with HasCallStack.

(<>) x y = append x y

instance Semigroup (Array a) where
  (<>) = append
Contributor

This is a purely stylistic choice, so feel free to ignore this, but I'd just inline the append here. I don't see much need for a separate function, and the fewer names the better.

Same goes with mempty and the variants for ArrayWriter.

Contributor Author

I like it because it's a tiny bit more DRY even though the chance of changing the implementation is close to zero.

make x n = Array (\k -> DArray.replicate (k x)) n
make :: HasCallStack => a -> Int -> Array a
make x n
  | n < 0 = error "Making negative length push array"
Member

Suggested change
  | n < 0 = error "Making negative length push array"
  | n < 0 = error "Making a negative length push array"

make :: HasCallStack => a -> Int -> Array a
make x n
  | n < 0 = error "Making negative length push array"
  | otherwise = Array (\makeA -> mconcat $ Prelude.replicate n (makeA x))
Member

Base has an stimes function for this. Maybe we should have one too?

Contributor Author

Array $ \makeA -> stimes n (makeA x) might be more efficient if ArrayWriter had a specialized stimes implementation. However, I don't want to change Monoid in this PR.

Member

As you wish, but I think it's fine to lazily add stuff to the library (though I'm sensitive to the argument that changing a type class is not a cheap action, and may need to be done carefully).

It was not really about efficiency (though it may be), but about clarity. We can do this in a separate PR.
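The clarity argument can be seen in a small sketch against base alone. It uses mtimesDefault from Data.Semigroup instead of stimes, because the default stimes rejects a zero count for plain semigroups while mtimesDefault returns mempty; Push, makeReplicate, makeTimes and pushToList are hypothetical, non-linear stand-ins for the library's definitions.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Semigroup (mtimesDefault)

-- Simplified, non-linear stand-in for the push array type.
newtype Push a = Push (forall m. Monoid m => (a -> m) -> m)

-- The version from the diff: build n copies, then concatenate them.
makeReplicate :: a -> Int -> Push a
makeReplicate x n
  | n < 0 = error "Making a negative length push array"
  | otherwise = Push (\k -> mconcat (replicate n (k x)))

-- The same thing phrased as "repeat this monoid value n times".
makeTimes :: a -> Int -> Push a
makeTimes x n
  | n < 0 = error "Making a negative length push array"
  | otherwise = Push (\k -> mtimesDefault n (k x))

-- Observe a push array by choosing the list monoid.
pushToList :: Push a -> [a]
pushToList (Push g) = g (\x -> [x])
```

A monoid with its own stimes could then do better than repeated appends, which is the efficiency point mentioned above.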

Comment on lines 113 to 117
-- Remark. In order for the function above to work, consume must forcibly
-- evaluate both tuples. If it was lazy, then we might not actually perform
-- @k1@ or @k2@ and the unsafe IO won't get done. In general, this makes me
-- think we haven't spelled out the careful rules of what consuming functions
-- must follow so that we can release a safe API.
Member

Sure, but, on the other hand, you cannot fail to consume them (try leaving out consume, for instance). That's what linearity does for you. The unsafe IO is (if I haven't messed up) fully encapsulated in the DArray abstraction. So the types guarantee that you can't get this wrong. Therefore, this comment is superfluous.

Suggested change
-- Remark. In order for the function above to work, consume must forcibly
-- evaluate both tuples. If it was lazy, then we might not actually perform
-- @k1@ or @k2@ and the unsafe IO won't get done. In general, this makes me
-- think we haven't spelled out the careful rules of what consuming functions
-- must follow so that we can release a safe API.

Contributor Author

@Divesh-Otwani Jan 5, 2021

Yes, I see xD *facepalms himself*. Without using something unsafe, there's no way to write a function of type ((),()) %1-> () that doesn't force the evaluation of both tuples.
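The point of this thread can be checked with a few lines of linear Haskell. This is a sketch assuming GHC 9.0 or later with the LinearTypes extension; consumePair is a hypothetical name, not the library's consume.

```haskell
{-# LANGUAGE LinearTypes #-}

-- Matching on both unit constructors is the only safe way to use the
-- linear argument, and matching forces both components.
consumePair :: ((), ()) %1 -> ()
consumePair ((), ()) = ()

-- A lazy version that ignores its argument is rejected by the type
-- checker, because the linear argument would not be consumed:
--
-- lazyConsume :: ((), ()) %1 -> ()
-- lazyConsume _ = ()
```

So any well-typed consumer of the pair necessarily runs the effects hidden behind both components.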

@Divesh-Otwani Divesh-Otwani requested a review from aspiwack January 5, 2021 18:08
Comment on lines 184 to 189
dropEmpty :: HasCallStack => DArray a %1-> ()
dropEmpty = Unsafe.toLinear unsafeDrop where
  unsafeDrop :: DArray a -> ()
  unsafeDrop (DArray ds)
    | MVector.length ds > 0 = error "Destination.dropEmpty on non-empty array."
    | otherwise = ()
Member

You know what? I don't think it's safe. We at least need to seq ds like this:

Suggested change
dropEmpty :: HasCallStack => DArray a %1-> ()
dropEmpty = Unsafe.toLinear unsafeDrop where
  unsafeDrop :: DArray a -> ()
  unsafeDrop (DArray ds)
    | MVector.length ds > 0 = error "Destination.dropEmpty on non-empty array."
    | otherwise = ()
dropEmpty :: HasCallStack => DArray a %1-> ()
dropEmpty = Unsafe.toLinear unsafeDrop where
  unsafeDrop :: DArray a -> ()
  unsafeDrop (DArray ds)
    | MVector.length ds > 0 = error "Destination.dropEmpty on non-empty array."
    | otherwise = ds `seq` ()

But generally speaking, I'm wondering if we shouldn't replace newtype DArray a = DArray (MVector …) with data DArray a where { DArray :: MVector … -> DArray a }. It adds an indirection. But it will save some Unsafe.toLinear presumably. Maybe most. This may considerably reduce the kernel of trust, here.

Contributor Author

I'll make that ^ an issue for later.
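For whoever picks up that issue, the newtype-versus-data idea from the comment above can be sketched as follows. It assumes GHC 9.0 or later; Dest and dropDest are hypothetical stand-ins (a list replaces the MVector), and the claim that a newtype field must be linear reflects my understanding of GHC's rules rather than anything stated in this thread.

```haskell
{-# LANGUAGE GADTs, LinearTypes #-}

-- Hypothetical stand-in for DArray. In GADT syntax, a plain arrow makes
-- the field unrestricted, so it can be used freely after a linear match.
data Dest a where
  Dest :: [a] -> Dest a

-- No Unsafe.toLinear needed: the match consumes the Dest linearly, and the
-- unrestricted payload may then be forced (or dropped) at will.
dropDest :: Dest a %1 -> ()
dropDest (Dest xs) = xs `seq` ()
```

The cost is the extra indirection mentioned in the comment, since a data constructor is boxed where a newtype is not.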

@Divesh-Otwani Divesh-Otwani mentioned this pull request Jan 6, 2021
@Divesh-Otwani Divesh-Otwani merged commit d0863c6 into master Jan 6, 2021
@Divesh-Otwani Divesh-Otwani deleted the better-push-arrays-2 branch January 6, 2021 15:44