BitArray performance enhancements via introduction of UnsafeArray and UnsafeBitArray #3265
This introduces UnsafeArray and UnsafeBitArray and uses them throughout bitarray.jl.
This is pretty interesting. I like the approach of using a differently typed object that accesses the same memory (this seems to be emerging as a very Julian pattern), although it feels a bit odd to wrap the safe version to get the unsafe version; it feels like it should be the other way around.
Yes, wrapper is not really the term here. It's more like the opposite, in that these things extract some minimal amount of information discarding everything else. Maybe a better description would be "unsafe array accessor", or "unsafe array view"?
This is good work, but I would not merge this. People will get in the habit of using UnsafeArray "for performance", and soon we will have code crashing (or worse, overwriting random data) all over the place. We must not give up safety and elegance. I notice that some of the changes (e.g. in I should really get around to adding the As I said in #3224, manually keeping an object alive is much harder to pull off than a local Another good alternative is to use vector performance kernels like
Yes, I see the danger (even though these types are of course not exported), and I agree that the need to keep the array alive is the biggest issue. One possibility to mitigate the problem of people inadvertently using unsafe arrays and passing them around would be not making them be AbstractArrays (the code in the PR only misses (also: I thought about using Array's
I really think we should avoid unsafe or very messy code. To me those are not worth 2x, maybe even 4x performance. These things should be solved in the compiler. Another tricky thing I'm not yet doing in the compiler is hoisting access to the data pointer of Vectors, since they can grow and therefore move at almost any point. This also makes it hard to do the optimization correctly manually using
Even though I am not convinced myself that this scheme is fine, I'll still try to further expand on the previous argument, for the sake of discussion: it might be argued that if we don't have
It's basically just a matter of substituting (That said, it's totally fine with me if this does not get in.)
No, |
Changes Unknown when pulling 90b1a46 on carlobaldassi:bitperf2 into * on JuliaLang:master*.
wtf, mystery coveralls again? looks like some repo with a bunch of php - they definitely have a bug somewhere we should probably report to someone |
(See also #2360)
This is possibly controversial, hence the pull request.
First, see the performance comparisons of affected functions:
https://gist.github.com/carlobaldassi/5691454
Some functions become 10 to 20 times faster; for most the boost is about 1.5x / 2x.
Basically, I introduced two new immutable types, `UnsafeArray` and `UnsafeBitArray`, for internal use, and used them throughout `bitarray.jl`. They actually have the semantics of (and inherit from) an `AbstractVector`, but I see them as wrappers around arrays. The code for `UnsafeArray` is a few lines of code in `array.jl`, and could be used there with analogous performance boosts, but I did not touch `Array` functions in this commit.

What I like here is that if you know what you're doing the code mostly stays clean and readable, basically hiding away stuff like `getindex_unchecked(B.chunks, i)` and `unsafe_store!(pointer(A), x, i)` and getting back standard notation as `B[i]` and `A[i] = x`.

(This was always the case, with the exception of the `find` function, for which the pointer notation still gives a better performance, at least in my tests.)

Thoughts?
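
For readers who want a concrete picture of the idea, here is a minimal sketch of an "unsafe array view" in current Julia syntax. It is not the code from this PR: the type name `UnsafeVectorView` and the function `double!` are hypothetical, the PR itself predates the `struct` keyword, and the sketch assumes a bits element type such as `Float64`. It also shows the lifetime problem raised in the discussion, since the caller has to keep the parent array alive explicitly.

```julia
# Hypothetical sketch, not the PR's implementation.
# Holds only a raw pointer and a length; it does NOT keep the parent array alive.
struct UnsafeVectorView{T} <: AbstractVector{T}
    ptr::Ptr{T}
    len::Int
end

# Extract the minimal information (data pointer and length) from a Vector.
# Assumes T is a bits type (e.g. Float64, UInt64).
UnsafeVectorView(a::Vector{T}) where {T} = UnsafeVectorView{T}(pointer(a), length(a))

Base.size(v::UnsafeVectorView) = (v.len,)
Base.IndexStyle(::Type{<:UnsafeVectorView}) = IndexLinear()

# No bounds checks: an out-of-range index reads or writes arbitrary memory.
Base.getindex(v::UnsafeVectorView, i::Int) = unsafe_load(v.ptr, i)
Base.setindex!(v::UnsafeVectorView, x, i::Int) = (unsafe_store!(v.ptr, x, i); v)

function double!(a::Vector{Float64})
    # The parent must stay rooted, and must not be resized, while the view is in use.
    GC.@preserve a begin
        v = UnsafeVectorView(a)
        for i in 1:length(v)
            v[i] = 2 * v[i]   # standard indexing notation over raw pointer access
        end
    end
    return a
end
```

Calling `double!([1.0, 2.0, 3.0])` doubles each element in place. The loop body reads like ordinary array code, which is the readability benefit described above; the `GC.@preserve` block and the ban on resizing are the part that is easy to get wrong, which is the objection raised in the review comments.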