Initial functionality #1
Conversation
Allows for fixed RNG in reference tests
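(Not from this PR: one common pattern for fixing the RNG in reference tests is to seed a StableRNGs.jl generator and build the test inputs from it, so reference images stay identical across runs and Julia versions. A minimal sketch, with illustrative sizes:)
# Sketch only: a fixed-seed RNG makes randomly generated reference inputs reproducible.
using StableRNGs, ImageCore
rng = StableRNG(123)  # fixed seed → same random stream on every run
img = RGB{N0f8}.(rand(rng, 16, 16), rand(rng, 16, 16), rand(rng, 16, 16))  # reproducible test image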
Codecov Report
@@           Coverage Diff           @@
##             main       #1   +/-   ##
=======================================
  Coverage        ?   93.61%
=======================================
  Files           ?        4
  Lines           ?       47
  Branches        ?        0
=======================================
  Hits            ?       44
  Misses          ?        3
  Partials        ?        0
This lowers the average quantization error (see the quick check after the snippet below).
Sorry @johnnychen94,
x ≤ 0 && return 1 / (2 * n)
x ≥ 1 && return (2 * n - 1) / (2 * n)
return (round(x * n - UQ_HALF) + UQ_HALF) / n
end
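(As a quick numerical check of the claim above, not part of the PR: mapping to the bin centers (k + 0.5)/n roughly halves the mean absolute error compared to truncating to the lower bin edge k/n. The names below are illustrative.)
using Statistics
n = 32
xs = rand(Float32, 10^6)
q_mid(x) = (round(x * n - 0.5f0) + 0.5f0) / n  # midpoint quantization, as in the snippet above
q_floor(x) = floor(x * n) / n                  # truncate to the lower bin edge
mean(abs.(q_mid.(xs) .- xs))    # ≈ 1/(4n)
mean(abs.(q_floor.(xs) .- xs))  # ≈ 1/(2n), about twice as large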
This is okay, but I get something faster by making it a plain lookup table. (The key idea is to pass T and N to the compiler as type parameters, so the table can be computed at compilation time via a generated function):
using ImageCore, FixedPointNumbers, BenchmarkTools
struct UniformQuantization{T, N} end
@inline function quantize(::UniformQuantization{T,N}, x::T) where {T,N}
_naive_quantize(UniformQuantization{T,N}(), x)
end
@inline function quantize(::UniformQuantization{T,N}, x::T) where {T<:N0f8,N}
@inbounds _build_lookup_table(UniformQuantization{T,N}())[reinterpret(FixedPointNumbers.rawtype(T), x) + 1]
end
# for every combination of {T,N}, this is only done once during the compilation
@generated function _build_lookup_table(::UniformQuantization{T,N}) where {T<:FixedPoint,N}
# This errors when T requires more than 32 bits
tmax = typemax(FixedPointNumbers.rawtype(T))
table = Vector{T}(undef, tmax + 1)
for raw_x in zero(tmax):tmax
x = reinterpret(T, raw_x)
table[raw_x + 1] = _naive_quantize(UniformQuantization{T,N}(), x)
end
return table
end
function _naive_quantize(::UniformQuantization{T,N}, x::T) where {T,N}
x ≤ 0 && return 1 / (2 * N)
x ≥ 1 && return (2 * N - 1) / (2 * N)
return (round(x * N - T(0.5)) + T(0.5)) / N
end
and I get
X = rand(N0f8, 512, 512)
Q = UniformQuantization{N0f8,32}()
@btime quantize.(Ref(Q), $X); # 95.689 μs (7 allocations: 256.24 KiB)
For comparison, the current version gives:
X = rand(N0f8, 512, 512)
Q = UniformQuantization(32)
@btime Q(X); # 5.336 ms (17 allocations: 6.01 MiB)
This trick is only applicable to fixed-point numbers. Do you think it's worth the change?
For floating-point numbers, a rounding step would be needed first in order to use the lookup table.
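(A sketch of what that rounding step could look like, assuming the 8-bit table above and that 8-bit precision is acceptable for float input; not part of the PR:)
# Sketch: round floating-point input to N0f8 first, then reuse the N0f8 lookup table.
@inline function quantize(q::UniformQuantization{N0f8,N}, x::AbstractFloat) where {N}
    quantize(q, N0f8(clamp(x, 0, 1)))  # clamp to [0, 1], round to the nearest N0f8, then look up
end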
I now have both in the code for maximum flexibility w.r.t. input data types. However, the benchmarks are a bit more complicated:
alg = UniformQuantization(32)
for T in (N0f8, Float32, RGB{N0f8}, RGB{Float32})
println("Timing $T:")
img = rand(T, 512, 512)
@btime alg($img)
@btime alg.($img) # skips `unique`
end
Timing N0f8:
25.436 ms (15 allocations: 1.50 MiB)
91.459 μs (4 allocations: 256.17 KiB)
Timing Float32:
4.659 ms (17 allocations: 6.01 MiB)
75.006 μs (3 allocations: 1.00 MiB)
Timing RGB{N0f8}:
33.910 ms (38 allocations: 1.38 MiB)
270.763 μs (3 allocations: 768.06 KiB)
Timing RGB{Float32}:
7.663 ms (39 allocations: 5.28 MiB)
344.923 μs (3 allocations: 3.00 MiB)
And building the look-up table for N0f32 freezes my Julia session (the table would need 2^32 entries, about 16 GiB of N0f32 values).
We certainly cannot make this a fully generic implementation here -- it is too risky to delegate that much computation to the compiler. We only need the types that really matter in practice.
For generic types, a runtime lookup table (using a `Dict{Type,Vector}`) could be used instead (though whether it would be faster than the naive version is unclear to me).
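(A rough sketch of such a runtime table, reusing `_naive_quantize` and `UniformQuantization` from the snippet above; it is keyed by (type, n) so different bin counts get separate tables. The names are hypothetical, not this package's API:)
# Hypothetical runtime cache: one lookup table per (fixed-point type, n), built lazily on first use.
const LOOKUP_TABLES = Dict{Tuple{DataType,Int},Vector}()

function runtime_table(::Type{T}, n::Int) where {T<:FixedPoint}
    get!(LOOKUP_TABLES, (T, n)) do
        tmax = typemax(FixedPointNumbers.rawtype(T))
        table = Vector{T}(undef, Int(tmax) + 1)
        for raw_x in zero(tmax):tmax
            table[raw_x + 1] = _naive_quantize(UniformQuantization{T,n}(), reinterpret(T, raw_x))
        end
        table
    end
end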
Introduces `quantize(img, alg)` and the following algorithms:
- `UniformQuantization`
- `KMeansQuantization` via Clustering.jl, adapted from DitherPunk.jl's implementation
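(A minimal usage sketch of the introduced API; the palette size passed to `KMeansQuantization` and the constructor arguments in general are illustrative assumptions:)
# Assumes the package defining `quantize` and the algorithms is loaded.
using ImageCore
img = rand(RGB{N0f8}, 512, 512)
quantize(img, UniformQuantization(32))  # quantize onto a fixed uniform grid
quantize(img, KMeansQuantization(16))   # cluster colors with Clustering.jl's k-means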