Initial functionality #1

Merged: 15 commits merged into main from ah/initial on Oct 3, 2022
Conversation

adrhill (Collaborator) commented on Sep 26, 2022

Introduces quantize(img, alg) and the following algorithms:

  • UniformQuantization
  • KMeansQuantization via Clustering.jl, adapted from DitherPunk.jl's implementation
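
For context, a minimal usage sketch of the proposed API. The module and image names and the constructor arguments here are assumptions for illustration, not taken from this PR:

using ColorQuantization, TestImages  # module names assumed for illustration

img = testimage("mandrill")
# Round each pixel to the center of a uniform 8×8×8 color-grid cell:
q1 = quantize(img, UniformQuantization(8))
# Cluster the colors into 16 groups with k-means (backed by Clustering.jl):
q2 = quantize(img, KMeansQuantization(16))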

adrhill (Collaborator, Author) commented on Sep 26, 2022

Codecov Report

❗ No coverage uploaded for pull request base (main@f0be8f8).
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main       #1   +/-   ##
=======================================
  Coverage        ?   93.61%           
=======================================
  Files           ?        4           
  Lines           ?       47           
  Branches        ?        0           
=======================================
  Hits            ?       44           
  Misses          ?        3           
  Partials        ?        0           


adrhill (Collaborator, Author) commented on Sep 28, 2022

Sorry @johnnychen94, src/uniform.jl could use another quick look. The implementation should now match MATLAB's, rounding to the center of each cube in the grid:

# Excerpt from src/uniform.jl (the quote omits the function head).
# UQ_HALF is the constant 0.5 (cf. T(0.5) in the review below).
x ≤ 0 && return 1 / (2 * n)
x ≥ 1 && return (2 * n - 1) / (2 * n)
return (round(x * n - UQ_HALF) + UQ_HALF) / n
end
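
For concreteness, here is the rounding worked through for n = 4 (cell centers at 1/8, 3/8, 5/8, 7/8), with UQ_HALF written out as 0.5:

n = 4
f(x) = x ≤ 0 ? 1 / (2n) : x ≥ 1 ? (2n - 1) / (2n) : (round(x * n - 0.5) + 0.5) / n
f(0.3)   # 0.375, the center of the cell [0.25, 0.5)
f(0.99)  # 0.875, the last cell center, same value as the clamped x ≥ 1 branch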
johnnychen94 (Member) commented on Sep 30, 2022

This is okay, but I get something faster by making it a plain lookup table. The key idea is to pass T and N to the compiler so that the computation can be done at compilation time, using a generated function:

using ImageCore, FixedPointNumbers, BenchmarkTools  # FixedPointNumbers provides FixedPoint and rawtype

struct UniformQuantization{T, N} end

@inline function quantize(::UniformQuantization{T,N}, x::T) where {T,N}
    _naive_quantize(UniformQuantization{T,N}(), x)
end

# N0f8 fast path: index the precomputed table with the raw integer value.
@inline function quantize(::UniformQuantization{T,N}, x::T) where {T<:N0f8,N}
    @inbounds _build_lookup_table(UniformQuantization{T,N}())[reinterpret(FixedPointNumbers.rawtype(T), x) + 1]
end

# for every combination of {T,N}, this is only done once during the compilation
@generated function _build_lookup_table(::UniformQuantization{T,N}) where {T<:FixedPoint,N}
    # This errors when T requires more than 32 bits
    tmax = typemax(FixedPointNumbers.rawtype(T))
    table = Vector{T}(undef, tmax + 1)
    for raw_x in zero(tmax):tmax
        x = reinterpret(T, raw_x)
        table[raw_x + 1] = _naive_quantize(UniformQuantization{T,N}(), x)
    end
    return table
end

function _naive_quantize(::UniformQuantization{T,N}, x::T) where {T,N}
    x ≤ 0 && return 1 / (2 * N)
    x ≥ 1 && return (2 * N - 1) / (2 * N)
    return (round(x * N - T(0.5)) + T(0.5)) / N
end

and I get

X = rand(N0f8, 512, 512)
Q = UniformQuantization{N0f8,32}()
@btime quantize.(Ref(Q), $X); #   95.689 μs (7 allocations: 256.24 KiB)

For comparison, the current version gives:

X = rand(N0f8, 512, 512)
Q = UniformQuantization(32)
@btime Q(X); #   5.336 ms (17 allocations: 6.01 MiB)

This trick only applies to fixed-point numbers. Do you think it's worth the change?
For floating-point numbers, a rounding step would be needed first to make full use of the lookup table.
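
To illustrate that remark: a hypothetical float path (not part of this PR) could round the input to the nearest N0f8 and then hit the same table, trading a little precision (inputs are snapped to 8 bits before quantization) for the fast lookup:

# Hypothetical method: serve float inputs from the N0f8 table by rounding first.
@inline function quantize(q::UniformQuantization{N0f8,N}, x::AbstractFloat) where {N}
    return quantize(q, N0f8(clamp(x, 0, 1)))  # conversion rounds to the nearest N0f8
end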

adrhill (Collaborator, Author) commented

I now have both in the code for maximum flexibility w.r.t. input data types. However, the benchmarks are a bit more involved:

alg = UniformQuantization(32)
for T in (N0f8, Float32, RGB{N0f8}, RGB{Float32})
    println("Timing $T:")
    img = rand(T, 512, 512)
    @btime alg($img)
    @btime alg.($img) # skips `unique`
end
Timing N0f8:
  25.436 ms (15 allocations: 1.50 MiB)
  91.459 μs (4 allocations: 256.17 KiB)
Timing Float32:
  4.659 ms (17 allocations: 6.01 MiB)
  75.006 μs (3 allocations: 1.00 MiB)
Timing RGB{N0f8}:
  33.910 ms (38 allocations: 1.38 MiB)
  270.763 μs (3 allocations: 768.06 KiB)
Timing RGB{Float32}:
  7.663 ms (39 allocations: 5.28 MiB)
  344.923 μs (3 allocations: 3.00 MiB)

And building the look-up table for N0f32 freezes my Julia session (that table would need 2^32 entries, i.e. about 16 GiB of N0f32 values, all generated at compile time).

johnnychen94 (Member) commented on Oct 3, 2022

We certainly cannot make this a generic implementation; it is too risky to delegate that much computation to the compiler. We only need the types that really matter in practice.
For generic types, a runtime lookup table (e.g. a Dict{Type,Vector}) could be used instead, though whether it would beat the naive version is unclear to me.
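
A minimal sketch of that runtime-cache idea, reusing _naive_quantize from the review above; the cache container and helper name are hypothetical:

# Hypothetical runtime cache: one table per (fixed-point type, level count).
const LUT_CACHE = Dict{Tuple{DataType,Int},Vector}()

function runtime_lookup_table(::Type{T}, n::Int) where {T<:FixedPoint}
    get!(LUT_CACHE, (T, n)) do
        tmax = typemax(FixedPointNumbers.rawtype(T))
        # Build the table once on first use, then reuse it on later calls.
        T[_naive_quantize(UniformQuantization{T,n}(), reinterpret(T, raw))
          for raw in zero(tmax):tmax]
    end
end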

adrhill merged commit c3fe280 into main on Oct 3, 2022
adrhill deleted the ah/initial branch on October 3, 2022 at 12:58