-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce interval storage size #98
Comments
I did an experiment with the current version and this new version of struct Inclusivity
val::UInt8
Inclusivity(x::UInt8) = new(x & 0b11)
end
Inclusivity(x::Integer) = Inclusivity(trunc(UInt8, x))
Inclusivity(first::Bool, last::Bool) = Inclusivity(UInt8(first) | UInt8(last) << 1)
Base.copy(inc::Inclusivity) = Inclusivity(inc.val)
Base.broadcastable(inc::Inclusivity) = Ref(inc)
function Base.convert(::Type{T}, inc::Inclusivity) where T <: Integer
return convert(T, getfield(inc, :val))
end
Base.first(inc::Inclusivity) = getfield(inc, :val) & 0b01 > 0
Base.last(inc::Inclusivity) = getfield(inc, :val) & 0b10 > 0
function Base.getproperty(inc::Inclusivity, f::Symbol)
if f === :first
first(inc)
elseif f === :last
last(inc)
else
getfield(inc, f)
end
end
isclosed(inc::Inclusivity) = x == 0b11
isopen(inc::Inclusivity) = x == 0b00 The experiment just checked the total size of the Original: julia> function demo(T)
x = Interval(T(1), T(2), true, false)
@show T sizeof(T) sizeof(x)
end
demo (generic function with 1 method)
julia> demo.(Base.BitInteger_types);
T = Int8
sizeof(T) = 1
sizeof(x) = 4
T = Int16
sizeof(T) = 2
sizeof(x) = 6
T = Int32
sizeof(T) = 4
sizeof(x) = 12
T = Int64
sizeof(T) = 8
sizeof(x) = 24
T = Int128
sizeof(T) = 16
sizeof(x) = 40
T = UInt8
sizeof(T) = 1
sizeof(x) = 4
T = UInt16
sizeof(T) = 2
sizeof(x) = 6
T = UInt32
sizeof(T) = 4
sizeof(x) = 12
T = UInt64
sizeof(T) = 8
sizeof(x) = 24
T = UInt128
sizeof(T) = 16
sizeof(x) = 40 Modified: julia> function demo(T)
x = Interval(T(1), T(2), true, false)
@show T sizeof(T) sizeof(x)
end
demo (generic function with 1 method)
julia> demo.(Base.BitInteger_types);
T = Int8
sizeof(T) = 1
sizeof(x) = 3
T = Int16
sizeof(T) = 2
sizeof(x) = 6
T = Int32
sizeof(T) = 4
sizeof(x) = 12
T = Int64
sizeof(T) = 8
sizeof(x) = 24
T = Int128
sizeof(T) = 16
sizeof(x) = 40
T = UInt8
sizeof(T) = 1
sizeof(x) = 3
T = UInt16
sizeof(T) = 2
sizeof(x) = 6
T = UInt32
sizeof(T) = 4
sizeof(x) = 12
T = UInt64
sizeof(T) = 8
sizeof(x) = 24
T = UInt128
sizeof(T) = 16
sizeof(x) = 40 It appears that due to memory alignment this size reduction doesn't help except for in the case where the interval uses endpoints of 1-byte |
The work in #89 has brought up more ideas here. If we were to also introduce a representation for unbounded intervals how does that impact memory. A quick demonstration: struct Interval1{T}
lower::T
upper::T
lower_inc::Bool
upper_inc::Bool
end
struct Interval2{T}
lower::T
upper::T
lower_inc::Bool
upper_inc::Bool
lower_bounded::Bool
upper_bounded::Bool
end
struct Interval3{T}
lower::T
upper::T
bound::UInt8
end
struct Interval4{T,L,U}
lower::T
upper::T
end We already know that due to memory alignment we can see the greatest impact with smaller julia> sizeof(Interval1{Int8}(1, 2, true, false))
4
julia> sizeof(Interval2{Int8}(1, 0, true, false, true, false))
6
julia> sizeof(Interval3{Int8}(1, 0, 0b0101))
3
julia> sizeof(Interval4{Int8,:closed,:unbounded}(1, 0))
2 Now lets inspect vectors of these eltypes. Note that all of these representations are bits types so no pointers should be present in the vectors: julia> Base.summarysize(fill(Interval1{Int8}(1, 2, true, false), 100))
440
julia> Base.summarysize(fill(Interval2{Int8}(1, 0, true, false, true, false), 100))
640
julia> Base.summarysize(fill(Interval3{Int8}(1, 0, 0b0101), 100))
340
julia> Base.summarysize(fill(Interval4{Int8,:closed,:unbounded}(1, 0), 100))
240 Each vector has 40 bytes of overhead otherwise the results are as expected. We'll also test vectors of mixed types: julia> Base.summarysize(repeat([Interval1{Int8}(1, 2, true, false), Interval1{Int8}(1, 2, false, true)], outer=50))
440
julia> Base.summarysize(repeat([Interval2{Int8}(1, 0, true, false, true, false), Interval2{Int8}(0, 1, false, true, false, true)], outer=50))
640
julia> Base.summarysize(repeat([Interval3{Int8}(1, 0, 0b0101), Interval3{Int8}(0, 1, 0b1010)], outer=50))
340
julia> Base.summarysize(repeat([Interval4{Int8,:closed,:unbounded}(1, 0), Interval4{Int8,:unbounded,:closed}(0, 1)], outer=50))
844 Using Let's look at a larger julia> sizeof(Interval1{Int64}(1, 2, true, false))
24
julia> sizeof(Interval2{Int64}(1, 0, true, false, true, false))
24
julia> sizeof(Interval3{Int64}(1, 0, 0b0101))
24
julia> sizeof(Interval4{Int64,:closed,:unbounded}(1, 0))
16 The vector results are similar but the mixed vector results are julia> Base.summarysize(repeat([Interval1{Int64}(1, 2, true, false), Interval1{Int64}(1, 2, false, true)], outer=50))
2440
julia> Base.summarysize(repeat([Interval2{Int64}(1, 0, true, false, true, false), Interval2{Int64}(0, 1, false, true, false, true)], outer=50))
2440
julia> Base.summarysize(repeat([Interval3{Int64}(1, 0, 0b0101), Interval3{Int64}(0, 1, 0b1010)], outer=50))
2440
julia> Base.summarysize(repeat([Interval4{Int64,:closed,:unbounded}(1, 0), Interval4{Int64,:unbounded,:closed}(0, 1)], outer=50))
872 Now Finally, we'll look at julia> Base.summarysize(repeat([Interval1{Int128}(1, 2, true, false), Interval1{Int128}(1, 2, false, true)], outer=50))
4040
julia> Base.summarysize(repeat([Interval2{Int128}(1, 0, true, false, true, false), Interval2{Int128}(0, 1, false, true, false, true)], outer=50))
4040
julia> Base.summarysize(repeat([Interval3{Int128}(1, 0, 0b0101), Interval3{Int128}(0, 1, 0b1010)], outer=50))
4040
julia> Base.summarysize(repeat([Interval4{Int128,:closed,:unbounded}(1, 0), Interval4{Int128,:unbounded,:closed}(0, 1)], outer=50))
904 I would say that using |
This is an idea I've had for a while on but haven't investigated. Currently
Inclusivity
is defined as:This structure uses 2-bytes to store these two booleans but we can theoretically reduce the size down to 2-bits. Unfortunately Julia only allows structures as small as 8-bits/1-byte but this is still an improvement. Potentially this could save a bunch of memory when it comes to storing arrays of
Interval
s.The text was updated successfully, but these errors were encountered: