[RFC/Coffee Break] What about a new type `NormedFloat`? #147
Define the addition and subtraction of two `N0f8`s:

```julia
Base.:+(x::N0f8, y::N0f8) = N0f8(x.i + y.i, 0)
Base.:-(x::N0f8, y::N0f8) = N0f8(x.i - y.i, 0)
# or
Base.:+(x::N0f8, y::N0f8) = NF16f8(Float16(Int16(x.i) + Int16(y.i)), 0)
Base.:-(x::N0f8, y::N0f8) = NF16f8(Float16(Int16(x.i) - Int16(y.i)), 0)
# or
Base.:+(x::N0f8, y::N0f8) = S7f8(Int16(x.i) + Int16(y.i), 0)
Base.:-(x::N0f8, y::N0f8) = S7f8(Int16(x.i) - Int16(y.i), 0)
```

Let's go!

```julia
using BenchmarkTools
A = collect(rand(N0f8, 256, 256));
B = collect(rand(N0f8, 256, 256));
view_A = view(A, :, :);
view_B = view(B, :, :);
```

```julia
julia> @benchmark view_A .+ view_B # N0f8
BenchmarkTools.Trial:
  memory estimate:  64.19 KiB
  allocs estimate:  4
  --------------
  minimum time:     4.117 μs (0.00% GC)
  median time:      5.900 μs (0.00% GC)
  mean time:        13.413 μs (13.47% GC)
  maximum time:     643.033 μs (98.04% GC)
  --------------
  samples:          10000
  evals/sample:     6

julia> @benchmark view_A .- view_B # N0f8
BenchmarkTools.Trial:
  memory estimate:  64.19 KiB
  allocs estimate:  4
  --------------
  minimum time:     4.383 μs (0.00% GC)
  median time:      6.200 μs (0.00% GC)
  mean time:        10.831 μs (11.66% GC)
  maximum time:     954.900 μs (97.91% GC)
  --------------
  samples:          10000
  evals/sample:     6

julia> @benchmark view_A .+ view_B # NF16f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     140.399 μs (0.00% GC)
  median time:      143.599 μs (0.00% GC)
  mean time:        150.347 μs (1.65% GC)
  maximum time:     2.787 ms (93.76% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark view_A .- view_B # NF16f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     140.199 μs (0.00% GC)
  median time:      143.199 μs (0.00% GC)
  mean time:        147.756 μs (1.39% GC)
  maximum time:     2.492 ms (92.98% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark view_A .+ view_B # S7f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     8.100 μs (0.00% GC)
  median time:      13.700 μs (0.00% GC)
  mean time:        18.973 μs (9.08% GC)
  maximum time:     1.940 ms (97.62% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark view_A .- view_B # S7f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     7.900 μs (0.00% GC)
  median time:      14.300 μs (0.00% GC)
  mean time:        17.624 μs (11.29% GC)
  maximum time:     2.190 ms (97.86% GC)
  --------------
  samples:          10000
  evals/sample:     1
```
Excellent!! 😂
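For readers without the experimental types at hand, the three strategies above can be sketched on raw `UInt8` payloads in plain Julia. The helper names below are hypothetical, invented for this sketch; they are not package code:

```julia
# Emulate the three addition strategies on raw N0f8 payloads (0xFF represents
# 1.0), without FixedPointNumbers.jl. Helper names are made up for this sketch.
wrap_add(x::UInt8, y::UInt8)  = x + y                         # N0f8: wraps modulo 256
widen_f16(x::UInt8, y::UInt8) = Float16(Int16(x) + Int16(y))  # NF16f8: widen, store as Float16
widen_i16(x::UInt8, y::UInt8) = Int16(x) + Int16(y)           # S7f8: widen to a signed payload

x = y = 0xCC               # each is about 0.8 in N0f8
wrap_add(x, y)             # 0x98, about 0.596: silently overflowed
widen_i16(x, y)            # 408, about 1.6: the overflow is representable
```

The first line reproduces what `N0f8` does today; the other two show why widening the payload (to `Float16` or `Int16`) preserves out-of-range sums.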
In summary, it's a normed floating-point number that behaves like a fixed-point number, and since it's a float, it avoids the overflow issues of integers. If I understand correctly, if we view numbers as N-bit 0/1 sequences, the difference comes down to this: @timholy tends to handle the bit sequence directly, while @kimikage wants to take advantage of the existing floating-point number system. My feeling (without justification) is that Tim's method would be more maintainable by eliminating unnecessary abstractions, though it requires more effort to make it work properly.
Yes. Regardless of the method, changing the arithmetic requires much effort to make it work properly.
OK. Here is a speed demon. 😈

```julia
function Base.:+(x::N0f8, y::N0f8)
    zi16 = Int16(x.i) + Int16(y.i)
    zi32 = reinterpret(Int32, Float32(zi16) * 1.92593f-34)
    zf16 = unsafe_trunc(UInt16, zi32 >> 0xD)
    NF16f8(reinterpret(Float16, zf16), 0)
end
function Base.:-(x::N0f8, y::N0f8)
    xf32p1, yf32p1 = reinterpret.(Float32, 0x53000000 .+ (x.i, y.i))
    zi32 = reinterpret(Int32, xf32p1 - yf32p1)
    zf16 = unsafe_trunc(UInt16, zi32 >> 0xD | ((zi32 >> 0x10) & 0x8000))
    NF16f8(reinterpret(Float16, zf16), 0)
end
```

```julia
julia> @benchmark view_A .+ view_B # diabolical NF16f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     11.700 μs (0.00% GC)
  median time:      15.199 μs (0.00% GC)
  mean time:        18.400 μs (10.13% GC)
  maximum time:     1.956 ms (97.30% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark view_A .- view_B # diabolical NF16f8
BenchmarkTools.Trial:
  memory estimate:  128.13 KiB
  allocs estimate:  4
  --------------
  minimum time:     11.900 μs (0.00% GC)
  median time:      15.201 μs (0.00% GC)
  mean time:        18.009 μs (8.97% GC)
  maximum time:     1.882 ms (97.11% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

😈
MRI benchmark (#143 (comment)):

```julia
julia> @btime view(mri, :,:,1:26) - view(mri, :,:,2:27); # N0f8
  490.800 μs (36 allocations: 1.04 MiB)

julia> @btime view(mri, :,:,1:26) - view(mri, :,:,2:27); # NF16f8
  718.100 μs (36 allocations: 2.09 MiB)

julia> @btime view(mri, :,:,1:26) - view(mri, :,:,2:27); # S7f8
  671.701 μs (36 allocations: 2.09 MiB)
```

In terms of the minimum time, `N0f8` is still the fastest here.

Originally posted by @kimikage in #143 (comment)

To be honest, the
The accumulation shows the true worth of these types:

```julia
Base.:(+)(x::NF32f8, y::N0f8) = NF32f8(x.i + y.i, 0)
Base.:(+)(x::S55f8, y::N0f8) = S55f8(x.i + y.i, 0)
Base.:(+)(x::S23f8, y::N0f8) = S23f8(x.i + y.i, 0)
```

```julia
julia> @btime mysum(0.0f0, view_A)
  4.071 μs (1 allocation: 16 bytes)
32765.545f0

julia> @btime mysum(reinterpret(NF32f8, 0.0f0), view_A) |> Float32
  2.378 μs (3 allocations: 48 bytes)
32765.541f0

julia> @btime mysum(reinterpret(S55f8, 0), view_A) |> Float32
  4.043 μs (3 allocations: 48 bytes)
32765.541f0

julia> @btime mysum(reinterpret(S23f8, Int32(0)), view_A) |> Float32
  2.122 μs (3 allocations: 48 bytes)
32765.541f0
```

Since Julia's SIMD optimizer still has room for improvement, this result is almost simply a matter of how many numbers fit in a SIMD register. It's not surprising.
I like the speed-demon version. But if what you're really after is detecting overflow and invalidating the result, we'd get the same result if we created a new type of 16-bit integer that reserves the top bit as a sign bit and the 2nd-from-top as an overflow flag.

```julia
julia> a = rand(Float16, 100);

julia> b = Float32.(a);

julia> @btime $a.*$a;
  793.337 ns (1 allocation: 288 bytes)

julia> @btime $b.*$b;
  37.396 ns (1 allocation: 496 bytes)
```
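The flag-bit idea can be sketched without defining a full number type. The constant and function name below are hypothetical, chosen just for this illustration:

```julia
# 16-bit pattern: top bit = sign (ignored in this sketch), 2nd-from-top bit =
# "invalid" flag, low 14 bits = payload. The flag is sticky: once an addition
# overflows, every downstream result stays flagged.
const FLAG = 0x4000

function flagged_add(x::UInt16, y::UInt16)
    z = (x & 0x3FFF) + (y & 0x3FFF)
    flag = ((x | y) & FLAG) | (z & FLAG)  # prior flag, or carry into bit 14
    (z & 0x3FFF) | flag
end

flagged_add(0x0001, 0x0002)  # 0x0003: no overflow
flagged_add(0x2000, 0x2000)  # 0x4000: overflowed, result invalidated
```

A consumer would then treat any value with `FLAG` set as "not a number", much like the `NaN` that `NormedFloat` gets for free from its float payload.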
Thank you for your comment. I'm glad to know a different way of thinking.

You are right. However, I don't think we should ensure complete safety. First, as you know, it is almost impossible to provide "completely safe" types with Julia's type system, i.e. without evaluating the values.

Originally posted by @timholy in #41 (comment)

This means that

Secondly, some users may want to use

I also think the exponent is of little use. However, it will not be necessary to discard the exponent and introduce a new non-standard format. I think the multiplication of

BTW, the multiplication of the current
Today's coffee break ☕

```julia
function Base.:*(x::NF16f8, y::NF16f8)
    xu16 = reinterpret(UInt16, x.i)
    xu32 = UInt32(xu16 << 1) << 0xC
    xf32 = reinterpret(Float32, xu32) * (5.194832f33 / 255.0f0) # not tuned
    yu16 = reinterpret(UInt16, y.i)
    yu32 = UInt32(yu16 << 1) << 0xC
    zf32 = xf32 * reinterpret(Float32, yu32)
    zs = (xu16 ⊻ yu16) & 0x8000
    zu32 = reinterpret(UInt32, zf32) >> 0xD
    zu16 = unsafe_trunc(UInt16, zu32)
    NF16f8(reinterpret(Float16, min(0x7C00, zu16) | zs), 0)
end
```

```julia
julia> a = rand(Float16, 100);

julia> @btime $a.*$a;
  745.169 ns (1 allocation: 288 bytes)

julia> nf = reinterpret.(NF16f8, a .* 255);

julia> @btime $nf.*$nf;
  53.651 ns (1 allocation: 288 bytes)
```

😈
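For contrast with the bit-twiddled version, the straightforward route is to widen to `Float32`, multiply, rescale, and narrow. A sketch, assuming the thread's convention that the `NF16f8` payload stores the value scaled by 255 (the function name is mine):

```julia
# Widen-multiply-narrow reference for comparison with the bit-twiddling *,
# operating directly on the Float16 payloads: if p = 255v, then the product
# payload is 255(vx*vy) = px*py/255.
mul_widen(x::Float16, y::Float16) = Float16(Float32(x) * Float32(y) / 255.0f0)

mul_widen(Float16(255), Float16(255))  # payload 255, i.e. 1.0 * 1.0 == 1.0
```

The bit-twiddled version folds this same widening and rescaling into raw exponent arithmetic, which is why it can skip the slow `Float16` multiply entirely.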
I wrote this article not to confuse the discussion, but to give a broader perspective, i.e. to remove the constraints coming from "belief".
I'm sure `Normed{<:Signed}` solves the `N0f8` "overflow" problem, but I doubt that it is the best solution. So, I proposed a new strange type `NormedFloat` as a spoiling candidate or a touchstone. `NormedFloat` is a kind of joke, so don't take it too seriously. ☕

However, I think that the concepts and techniques related to `NormedFloat` might be helpful for other developments.

`NormedFloat` can be defined as a subtype `<: FixedPoint{DummyInt{T},f}`; this is just a workaround to use ColorVectorSpace.jl without modifications. As the name suggests, `NormedFloat` is not a `FixedPoint` number.

And, the signed `Normed` can be defined similarly. I don't use `Normed{<:Signed}` here, to make it easier to experiment in a local REPL. If you already have `Normed{<:Signed}`, you can use it.

Just for display (not optimized):

Now, the following are examples of the numbers:
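The type definitions themselves were not captured in this excerpt. As a rough sketch of the shape described above, where the `DummyInt` workaround and the `(payload, 0)` raw constructor come from the thread and everything else is my assumption, with a stand-in for `FixedPointNumbers.FixedPoint` so the snippet runs on its own:

```julia
# Stand-in for FixedPointNumbers.FixedPoint so this sketch is self-contained.
abstract type FixedPoint{T <: Integer, f} <: Real end

# DummyInt carries the float type through FixedPoint's Integer slot; it is
# never instantiated and exists only so ColorVectorSpace.jl's signatures match.
struct DummyInt{T <: AbstractFloat} <: Integer end

struct NormedFloat{T <: AbstractFloat, f} <: FixedPoint{DummyInt{T}, f}
    i::T
    # mimic the (payload, 0) raw constructor used throughout this thread
    NormedFloat{T,f}(i::T, _) where {T <: AbstractFloat, f} = new{T,f}(i)
end

const NF16f8 = NormedFloat{Float16, 8}
const NF32f8 = NormedFloat{Float32, 8}

NF16f8(Float16(3), 0).i  # the raw Float16 payload
```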