Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ST instead of polymorphic m in Vector/MVector type classes #335

Merged
merged 4 commits into from Jan 16, 2021

Conversation

Shimuuar
Copy link
Contributor

This is implementation of idea from #315. Instead of using polymorphic m in methods of MVector/Vector type class use ST and lift it to PrimMonad at use sites. It's probably easier to illustrate idea with few diffs:

-  basicUnsafeNew   :: PrimMonad m => Int -> m (v (PrimState m) a)
+  basicUnsafeNew   :: Int -> ST s (v s a)
 unsafeRead v i = UNSAFE_CHECK(checkIndex) "unsafeRead" i (length v)
+               $ stToPrim
                $ basicUnsafeRead v i

Main motivation for change is to allow GND for those type classes. It's partial success. In following code snippet:

newtype instance MVector s (Sum a) = MV_Sum (MVector s a)
newtype instance Vector    (Sum a) = V_Sum  (Vector a)
deriving instance Unbox a => G.Vector  Vector  (Sum a)
deriving instance Unbox a => M.MVector MVector (Sum a)

MVector instance derived successfully. Vector fails with:

    • Couldn't match representation of type ‘m1 a’
                               with that of ‘m1 (Sum a)’

because of basicUnsafeIndexM :: Monad m => v a -> Int -> m a. GHC cannot coerce between m a and m (Sum a). I'm a bit at loss here.

@Shimuuar Shimuuar marked this pull request as draft October 19, 2020 17:56
@Shimuuar
Copy link
Contributor Author

I said that I don't know what to do and then I used same trick: use box monad in order to express strictness in indexing and keep indexed value unevaluated. Everything works except GHC8.0 can't derive instances for some reason. While this Box trick seems to work I didn't test whether it really works.

I also want to test whether this approach works with DerivingVia. GND is surely nice but I'm not convinced that it alone make this change valuable enough.

@Shimuuar Shimuuar force-pushed the use-ST branch 2 times, most recently from 9c22002 to 5b5c0aa Compare January 8, 2021 12:09
@Shimuuar Shimuuar marked this pull request as ready for review January 8, 2021 12:14
@Shimuuar
Copy link
Contributor Author

Shimuuar commented Jan 8, 2021

In the end idea of using concrete monads in the definition of vector's type classes worked out nicely. basicWhatever methods which used PrimMonad m now are defined in terms of ST, and basicIndexM is defined in terms of Box. Rest of API is not changed at all.

  1. GND works. It's now possible to define unbox instances for all newtypes using it. (Except GHC8.0 fails with weird error, and we'll have to wait until we drop support for it)

  2. DerivingVia works too. As an example:

newtype instance MVector s Int = MV_Int (P.MVector s Int)
newtype instance Vector    Int = V_Int  (P.Vector    Int)
instance Unbox Int
deriving via (UnboxViaPrim Int) instance M.MVector MVector Int
deriving via (UnboxViaPrim Int) instance G.Vector Vector Int

3, Box tricked for indexM works too:

*Data.Vector> _ <- indexM (Data.Vector.fromList [undefined]) 0
*Data.Vector> _ <- indexM undefined 0
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
  undefined, called at <interactive>:5:13 in interactive:Ghci1
  1. As small test for breakage I built statistics with this change. It and its dependencies has both TH and handwritten instances. All builds without any problems

Overall I think this is big improvement for vector usability.

@Shimuuar Shimuuar changed the title WIP: Use ST instead of polymorphic m in Vector/MVector type classes Use ST instead of polymorphic m in Vector/MVector type classes Jan 8, 2021
@lehins
Copy link
Contributor

lehins commented Jan 8, 2021

@Shimuuar This is great news! There is only one concern I have about GND and unboxing is that deriving does not add INLINE pragmas and I have seen this happen before where GHD causes derived class functions from not being inlined. (Here is a real life example where I saw this being a problem: commercialhaskell/rio#166) If the same happens with deriving unboxed vector it can have some serious performance implications. I think it would be great to have some benchmarks comparison of at least few functions of derived vs underived approach. Have you tried checking performance or differences in compiled core?

Other than the above concern I think this is a great improvement. I'll do a proper PR review either today or the latest tomorrow, just in case if there is anything I can spot.

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Jan 8, 2021

Good point! I would've expect that GHC inlines coerced methods eagerly. I'll look into it on the weeekends

@Shimuuar
Copy link
Contributor Author

I've converted all existing benchmarks which use Double to CDouble. There's no measurable difference:

Double:

quickhull                                mean 85.75 ms  ( +- 994.5 μs  )
spectral                                 mean 3.183 ms  ( +- 35.50 μs  )
tridiag                                  mean 99.24 ms  ( +- 1.458 ms  )
findIndexR                               mean 189.7 μs  ( +- 2.424 μs  )
findIndexR_naïve                         mean 109.1 ms  ( +- 3.609 ms  )
findIndexR_manual                        mean 352.2 μs  ( +- 10.20 μs  )

CDouble:

quickhull                                mean 85.88 ms  ( +- 967.8 μs  )
spectral                                 mean 3.180 ms  ( +- 92.69 μs  )
tridiag                                  mean 99.59 ms  ( +- 2.563 ms  )
findIndexR                               mean 194.0 μs  ( +- 7.610 μs  )
findIndexR_naïve                         mean 109.8 ms  ( +- 4.609 ms  )
findIndexR_manual                        mean 349.3 μs  ( +- 5.304 μs  )

P.S. We don't have instances for types from Foreight.C.Types. Looks like oversight for me

Copy link
Contributor

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shimuuar Besides few minor suggestions this PR looks great. Do you mind adding an entry to change log that mentions this breaking change

Data/Vector/Unboxed/Base.hs Outdated Show resolved Hide resolved
Data/Vector/Unboxed/Base.hs Show resolved Hide resolved
Data/Vector/Unboxed/Base.hs Outdated Show resolved Hide resolved
@@ -165,6 +165,90 @@ instance G.Vector Vector () where
-- Primitive types
-- ---------------

-- | Newtype wrapper which allows to derive unboxed vector in term of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to also include GND approach here as well as the third alternative that doesn't rely on via. This PR makes it possible, right? I was under impression that it would be possible to derive Unbox with this approach for things like newtype RGB = RGB (Double, Double, Double). If deriving using an already existing Unbox instance is still not possible, then feel free ignore this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding GND example is very good idea but it's probably better to add it in documentation o Unboxed module. Add section about deriving of unboxed vector.

Data/Vector/Generic/Base.hs Outdated Show resolved Hide resolved
@lehins
Copy link
Contributor

lehins commented Jan 10, 2021

With regards to failing CI. I've seen doctests consistently fail on other projects when executed with ghc-8.10. I have not investigated the issue what is the cause of it, since doctests are mainly intended to check that examples are still correct. So we might wanna turn off doctests for ghc-8.10 until this issue resolves itself with future ghc+doctests.

@Bodigrim
Copy link
Contributor

So we might wanna turn off doctests for ghc-8.10 until this issue resolves itself with future ghc+doctests.

Let's disable them for now, we've seen the same issue with random some time ago.

@Shimuuar
Copy link
Contributor Author

I've fixed everything except documentation on using GND for unboxed vectors.Will write it tomorrow

@Shimuuar Shimuuar force-pushed the use-ST branch 3 times, most recently from 050b5e5 to 0ae7fc2 Compare January 13, 2021 18:49
changelog.md Show resolved Hide resolved
@Shimuuar
Copy link
Contributor Author

I've added note about GND. I didn't change existing documentation.But it seems to be somewhat misleading:

Implementing unboxed vectors for new data types can be very easy.

In particular this line seems to be out of place. BTW approach outlined there (and it's one used by vector-th-unbox) could be turned into deriving via with approach a la https://hackage.haskell.org/package/iso-deriving It's probably should be done in separate PR

@lehins
Copy link
Contributor

lehins commented Jan 16, 2021

@Shimuuar Do you mind rebasing on master and squashing all the commits? Other than that I think we can merge it in

changelog.md Outdated Show resolved Hide resolved
End goal is to make it possible to derive Vector/MVector type classes using
GND/DerivingVia mechanism. Currently it; impossible since GHC can't coerce
between `m a` and `m (Newtype a)` where m is a type variable

This could be worked around by switching from `forall m. PrimMonad m => .. ->
m()` to working in the ST monad which GHC could coerce.

Unfortunately Vector type class contains method

>   basicUnsafeIndexM  :: Monad m => v a -> Int -> m a

which suffers from same problem
But instead of using ST we use Box monad which allows to force act of indexing
without forcing value itself

Now everything compiles
It serves mostly as example for deriving via. Once support for GHC<8.6 is
dropped it would be possible to replace CPP for primitive types with deriving
via
@Shimuuar
Copy link
Contributor Author

@lehins done

@Bodigrim thanks! I've updated changelog

@Bodigrim Bodigrim merged commit 0ae6b2f into haskell:master Jan 16, 2021
@Bodigrim
Copy link
Contributor

Awesome stuff!

@Shimuuar Shimuuar deleted the use-ST branch March 4, 2021 15:55
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Aug 19, 2022
# Changes in version 0.13.0.0

 * `mkType` from `Data.Vector.Generic` is deprecated in favor of
   `Data.Data.mkNoRepType`
 * The role signatures on several `Vector` types were too permissive, so they
   have been tightened up:
   * The role signature for `Data.Vector.Mutable.MVector` is now
     `type role MVector nominal representational` (previously, both arguments
     were `phantom`). [#224](haskell/vector#224)
   * The role signature for `Data.Vector.Primitive.Vector` is now
     `type role Vector nominal` (previously, it was `phantom`).
     The role signature for `Data.Vector.Primitive.Mutable.MVector` is now
     `type role MVector nominal nominal` (previously, both arguments were
     `phantom`). [#316](haskell/vector#316)
   * The role signature for `Data.Vector.Storable.Vector` is now
     `type role Vector nominal` (previous, it was `phantom`), and the signature
     for `Data.Vector.Storable.Mutable.MVector` is now
     `type role MVector nominal nominal` (previous, both arguments were
     `phantom`). [#235](haskell/vector#235)

     We pick `nominal` for the role of the last argument instead of
     `representational` since the internal structure of a `Storable` vector is
     determined by the `Storable` instance of the element type, and it is not
     guaranteed that the `Storable` instances between two representationally
     equal types will preserve this internal structure.  One consequence of this
     choice is that it is no longer possible to `coerce` between
     `Storable.Vector a` and `Storable.Vector b` if `a` and `b` are nominally
     distinct but representationally equal types. We now provide
     `unsafeCoerce{M}Vector` and `unsafeCast` functions to allow this (the onus
     is on the user to ensure that no `Storable` invariants are broken when
     using these functions).
 * Methods of type classes `Data.Vector.Generic.Mutable.MVector` and
   `Data.Vector.Generic.Vector` use concrete monads (`ST`, etc) istead of being
   polymorphic (`PrimMonad`, etc). [#335](haskell/vector#335).
   This makes it possible to derive `Unbox` with:
   * `GeneralizedNewtypeDeriving`
   * via `UnboxViaPrim` and `Prim` instance
   * via `As` and `IsoUnbox` instance: [#378](haskell/vector#378)
 * Add `MonadFix` instance for boxed vectors: [#312](haskell/vector#312)
 * Re-export `PrimMonad` and `RealWorld` from mutable vectors:
   [#320](haskell/vector#320)
 * Add `maximumOn` and `minimumOn`: [#356](haskell/vector#356)
 * The functions `scanl1`, `scanl1'`, `scanr1`, and `scanr1'` for immutable
   vectors are now defined when given empty vectors as arguments,
   in which case they return empty vectors. This new behavior is consistent
   with the one of the corresponding functions in `Data.List`.
   Prior to this change, applying an empty vector to any of those functions
   resulted in an error. This change was introduced in:
   [#382](haskell/vector#382)
 * Change allocation strategy for `unfoldrN`: [#387](haskell/vector#387)
 * Remove `CPP` driven error reporting in favor of `HasCallStack`:
   [#397](haskell/vector#397)
 * Remove redundant `Storable` constraints on to/from `ForeignPtr` conversions:
   [#394](haskell/vector#394)
 * Add `unsafeCast` to `Primitive` vectors: [#401](haskell/vector#401)
 * Make `(!?)` operator strict: [#402](haskell/vector#402)
 * Add `readMaybe`: [#425](haskell/vector#425)
 * Add `groupBy` and `group` for `Data.Vector.Generic` and the specialized
   version in `Data.Vector`, `Data.Vector.Unboxed`, `Data.Vector.Storable` and
   `Data.Vector.Primitive`. [#427](haskell/vector#427)
 * Add `toArraySlice` and `unsafeFromArraySlice` functions for conversion to and
   from the underlying boxed `Array`: [#434](haskell/vector#434)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants