-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add strict vector #488
Add strict vector #488
Conversation
This is essentially a copy of lazy boxed vector with - MVector.basicUnsafeWrite - MVector.basicUnsafeReplicate - Vector.elemseq are made strict Tests are copy of lazy vector's tests as well
fromArray reduces all elements of array to WHNF but that changes complexity from O(1) to O(n). To compensate we provide lazy variant which pass all responsibility to user
Strict vectors are generally better default
At this point PR is pretty much ready |
@infinity0 could you possibly review? |
Basically it's same as version from I also switched examples in D.V.Generic to strict vectors but that's purely documentation change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
vector/src/Data/Vector/Strict.hs
Outdated
|
||
|
||
-- | Boxed vectors, supporting efficient slicing. | ||
data Vector a = Vector {-# UNPACK #-} !Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be a newtype over lazy boxed vectors? I imagine it would save us some instance definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be possible. Probably most importantly for cheap conversion between strict and lazy vectors.
But we should be careful with instances. Anything that produces vectors should be implemented in terms of operations over strict vectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth to employ newtypes if possible, to demonstrate clearly that it is the same data structure just with different instances.
type role MVector nominal representational | ||
|
||
-- | Mutable boxed vectors keyed on the monad they live in ('IO' or @'ST' s@). | ||
data MVector s a = MVector { _offset :: {-# UNPACK #-} !Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question: could this be a newtype over lazy boxed MVector
?
Co-authored-by: ˌbodʲɪˈɡrʲim <andrew.lelechenko@gmail.com>
This is done over @Bodigrim's suggestion. This allowed to drop quite bit of code
However we can't derive Eq & Ord since DerivingStrategies are only available since 8.2
It seems cabal decided on plan with deepseq<1.4.3
@Bodigrim switching to a newtype worked out quite well and allowed to drop a lot of duplicated code. It doesn't work all that well for instances since DerivingStrategies are only available since 8.2. So it's not possible to instruct GHC to derive say I also added function for conversion between strict and lazy vectors. |
vector/src/Data/Vector/Strict.hs
Outdated
-- of holding any Haskell value). It is possible to create vector | ||
-- which contain bottom elements, either by using mutable interfaces | ||
-- (see "Data.Vector.Strict.Mutable") or functions that don't preserve | ||
-- strictness ('lazyFromArray'). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case for this? Just seems like a potential source of bugs and/or making existing bugs harder to find. If somebody sees this strict type they are going to assume it doesn't contain bottom elements, but this design seems to suggest otherwise.
IMO a type that is advertised as "strict" should have no possible external (user) code path that can lead it to containing non-WHNF elements, even if it's internally implemented using lazy structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they retain O(1) complexity but user has to deal with strictness.
I don't think this is a good use-case, if the user selects a strict data type they should be aware that there will be potential performance penalties for certain operations, we shouldn't expose a non-strict operation for them to "regain performance", they should instead just pick the lazy type.
edit: If a strict struct has to be passed around a bunch of different libraries, any one of them could make a mistake making this trade-off decision, and it will be a massive pain to track down the culprit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one hand it's one of many unsafe functions where all responsibility on maintaining invariants is passed to the caller. So it seems that it would be in a good company.
However here invariant is "all elements are in WHNF" And it's known to be rather tricky to maintain. And even worse it's difficult to check and invariant violations could be invisible when all test look fine and then it fail on prod due to space leak.
So I think it would be better to drop these functions and if someone really need these it could be discussed.
These functions require user to maintain invariant that all element of vector is in WHNF. And it's well known that this insvariant is difficult to enforce and its violations are usually invisible. Such spec leaks may be invisibe during testing only to cause pproblem in production. So it's better to remove them unless good use case for them appears
I removed function that don't enforce strictness: |
vector/src/Data/Vector/Strict.hs
Outdated
@@ -23,11 +23,9 @@ | |||
-- Immutable strict boxed vectors (that is, polymorphic arrays capable | |||
-- of holding any Haskell value). It is possible to create vector | |||
-- which contain bottom elements, either by using mutable interfaces |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can drop "either" here now, also might be good to clarify the vectors that might contain bottom elements are the original data type, not this strict type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworded module comment in order to make it more clear
I'm going to merge this PR on next weekend if there are no objections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed through the PR. Looks like it is just more of the same as regular lazy vector. I looked at all the entry points where lazy values could enter into the strict vector and it looks like they are all covered.
So, no objections from me.
Fixes #483 and #380
Implementation follows discussion in issues above and virtually identical to one in
strict-containers
. Only differencefromArray
andunsafeFromArraySlice
now force all elements of returned vector to WHNF. It changes complexity to O(n) so to compensatelazyFromArray
andunsafeLazyFromArraySlice
are added, they retain O(1) complexity but user has to deal with strictness.This is draft since there's still some work to do with documentation.
V.
is used by lazy vector,VS
by storable. One option@since 0.13.2.0
on each function inStrict
modulesAlso