Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting standard deviation of img (std) #187

Closed
philtomson opened this issue Sep 20, 2014 · 10 comments
Closed

Getting standard deviation of img (std) #187

philtomson opened this issue Sep 20, 2014 · 10 comments

Comments

@philtomson
Copy link

I'm a newbie to both julia and the Images package so perhaps this is a silly question:

I notice that I can get the mean of an img, via:

mean(img)

(in fact, in IJulia this displays that mean color)
But getting a standard deviation using std(img) gives:

`varm` has no method matching    varm(::Image{RGB{UfixedBase{Uint8,8}},2,Array{RGB{UfixedBase{Uint8,8}},2}}, ::RGB{Float32})

while loading In[66], in expression starting on line 1
in var at statistics.jl:162

I also tried: std(data(img))

...but got the same result.

@timholy
Copy link
Member

timholy commented Sep 21, 2014

It's not a silly question at all. This should work, but I agree it doesn't. Since you're just starting (welcome!), let me show you how I figured out where the problem is. Unfortunately, I am not immediately sure what to do about it, but I've opened an issue in julia proper (see the reference link above).

julia> using Images, TestImages

julia> img = testimage("mandril")
RGB Image with:
  data: 512x512 Array{RGB{UfixedBase{Uint8,8}},2}
  properties:
    IMcs: sRGB
    spatialorder:  x y
    pixelspacing:  1 1

julia> @which mean(img)
mean(A::AbstractArray{T,N}) at statistics.jl:17

julia> @which std(img)
std(A::AbstractArray{T,N}) at statistics.jl:204

julia> edit("statistics.jl", 204)  # note that this calls to var, unsurprisingly

julia> @which var(img)
var{T}(A::AbstractArray{T,N}) at statistics.jl:162   # same file, so just navigate to line 162 in your editor

julia> @which varm(img, mean(img))
ERROR: no method found for the specified argument types
 in which at interactiveutil.jl:201

# So we've found the problem. What methods does varm have?
julia> methods(varm)
# 4 methods for generic function "varm":
varm(v::Range{T},m::Number) at statistics.jl:180
varm{T}(A::AbstractArray{T,N},m::Number) at statistics.jl:141
varm{T}(A::AbstractArray{T,N},m::AbstractArray{T,N},region) at statistics.jl:157
varm(iterable,m::Number) at statistics.jl:175

So the problem is that varm is too strongly typed: it specifies that the mean value should be a Number, and an RGB is not a number.

The solution is to remove the typing in varm. This will be a little tricky (for a variety of reasons, mostly because it cleverly computes the covariance in multidimensional cases), so I will tackle fixing this in base julia myself.

However, this fix will presumably only be available if you're running a development version of julia; for most users, we recommend sticking with 0.3. My guess is that Images needs a custom implementation of std and var to handle this---this should be much simpler, and just needs a couple of methods with signatures like

import Base: std, var
function std{T<:ColorType}(img::AbstractArray{T})
    ...
end

Unlike perhaps some other languages you may be familiar with, simple loops suffice to implement high-performance versions of such functions.

I'll wait to hear back from you about whether you want to try tackling those as a good starter project in Julia, or whether you want me to handle it.

@philtomson
Copy link
Author

Tim,

Thanks for the very thorough response. Especially the debugging demo -
that's handy.

On Sun, Sep 21, 2014 at 3:05 AM, Tim Holy notifications@github.com wrote:

It's not a silly question at all. This should work, but I agree it
doesn't. Since you're just starting (welcome!), let me show you how I
figured out where the problem is. Unfortunately, I am not immediately sure
what to do about it, but I've opened an issue in julia proper (see the
reference link above).

julia> using Images, TestImages
julia> img = testimage("mandril")RGB Image with:
data: 512x512 Array{RGB{UfixedBase{Uint8,8}},2}
properties:
IMcs: sRGB
spatialorder: x y
pixelspacing: 1 1
julia> @which mean(img)mean(A::AbstractArray{T,N}) at statistics.jl:17
julia> @which std(img)std(A::AbstractArray{T,N}) at statistics.jl:204
julia> edit("statistics.jl", 204) # note that this calls to var, unsurprisingly
julia> @which var(img)var{T}(A::AbstractArray{T,N}) at statistics.jl:162 # same file, so just navigate to line 162 in your editor
julia> @which varm(img, mean(img))ERROR: no method found for the specified argument types
in which at interactiveutil.jl:201

So we've found the problem. What methods does varm have?julia> methods(varm)# 4 methods for generic function "varm":varm(v::Range{T},m::Number) at statistics.jl:180varm{T}(A::AbstractArray{T,N},m::Number) at statistics.jl:141varm{T}(A::AbstractArray{T,N},m::AbstractArray{T,N},region) at statistics.jl:157varm(iterable,m::Number) at statistics.jl:175

So the problem is that varm is too strongly typed: it specifies that the
mean value should be a Number, and an RGB is not a number.

The solution is to remove the typing in varm. This will be a little
tricky (for a variety of reasons, mostly because it cleverly computes the
covariance in multidimensional cases), so I will tackle fixing this in base
julia myself.

However, this fix will presumably only be available if you're running a
development version of julia; for most users, we recommend sticking with
0.3. My guess is that Images needs a custom implementation of std and var
to handle this---this should be much simpler, and just needs a couple of
methods with signatures like

import Base: std, varfunction std{T<:ColorType}(img::AbstractArray{T})
...end

So I notice you're importing the std and var there from base. Are you
suggesting that we can just implement std for ColorType using some type
coersions and call the underlying var and std from base?

Unlike perhaps some other languages you may be familiar with, simple loops
suffice to implement high-performance versions of such functions.

Loops get vectorized?

I'll wait to hear back from you about whether you want to try tackling
those as a good starter project in Julia, or whether you want me to handle
it.

I'll take a look. I'm very much a newbie at this point. But I think julia
can really help us out on a project at work and I think the gains are such
that I can put up with a bit of getting cut on the bleeding edge ;) In
fact, one of the things I needed to do for the project is find the std
deviation on a series of images which is how I ran into this problem.

Do you hangout on the #julia IRC channel and if so what times? (in case I
have... errr... in the likelihood that I have questions)

Phil


Reply to this email directly or view it on GitHub
#187 (comment).

@timholy
Copy link
Member

timholy commented Sep 21, 2014

So I notice you're importing the std and var there from base. Are you
suggesting that we can just implement std for ColorType using some type
coersions and call the underlying var and std from base?

Julia distinguishes functions from methods. Here's a trivial example:

f(x::Integer) = x+1
f(x::FloatingPoint) = x+2

f is a function which has two methods.

If you want to extend a function defined in another module, you have to import it, and then you can add new methods. We'd just need a method that works for ColorTypes, i.e., skips that call to varm and does it manually.

Loops get vectorized?

Vectorization has two meanings. In Matlab, vectorized means "I take a large collection of objects and repetitively do the same thing to each element, doing the looping in a fast language like C." Julia is a fast language (as fast as C), so there's no need for vectorization in this sense. For example, compare

julia> a = rand(10^7);

julia> b = sin(a);

julia> @time b = sin(a);
elapsed time: 0.23192822 seconds (80000128 bytes allocated)

julia> function mysin(a)
           b = similar(a)
           for i = 1:length(a)
               b[i] = sin(a[i])
           end
           b
       end
mysin (generic function with 1 method)

julia> mysin(a);

julia> @time mysin(a);
elapsed time: 0.184923681 seconds (80000128 bytes allocated)

# Matlab
>> a = rand(10^7,1);
>> tic; b = sin(a); toc
Elapsed time is 0.184815 seconds.

So you can basically write any function you want in Julia and not worry about whether you can vectorize it. This is incredibly freeing.

The other meaning of vectorization refers to SIMD operations on the CPU.

I'll take a look. I'm very much a newbie at this point. But I think julia
can really help us out on a project at work and I think the gains are such
that I can put up with a bit of getting cut on the bleeding edge ;) In
fact, one of the things I needed to do for the project is find the std
deviation on a series of images which is how I ran into this problem.

In that case, I encourage you to try implementing a version of std for a single image that we can incorporate into Images; I'll coach you through any problems you have. (You can look at base/statistics.jl for inspiration, but don't be surprised if you end up with an algorithm that is actually better than some of the ones there---just looking at that source I can see a couple of problems beyond the one that triggered this report.) You'll learn enough in the process that it will be easier for you to write really efficient algorithms for other purposes.

Do you hangout on the #julia IRC channel and if so what times?

I don't IRC, sorry. But I'm happy to continue the discussion thread here.

@philtomson
Copy link
Author

On Sun, Sep 21, 2014 at 1:26 PM, Tim Holy notifications@github.com wrote:

Vectorization has two meanings. In Matlab, vectorized means "I take a
large collection of objects and repetitively do the same thing to each
element, doing the looping in a fast language like C." Julia is a fast
language (as fast as C), so there's no need for vectorization in this
sense. For example, compare

julia> a = rand(10^7);
julia> b = sin(a);
julia> @time b = sin(a);elapsed time: 0.23192822 seconds (80000128 bytes allocated)
julia> function mysin(a)
b = similar(a)
for i = 1:length(a)
b[i] = sin(a[i])
end
b
endmysin (generic function with 1 method)
julia> mysin(a);
julia> @time mysin(a);elapsed time: 0.184923681 seconds (80000128 bytes allocated)

Matlab>> a = rand(10^7,1);>> tic; b = sin(a); tocElapsed time is 0.184815 seconds.

So you can basically write any function you want in Julia and not worry
about whether you can vectorize it. This is incredibly freeing.

The other meaning of vectorization refers to SIMD operations on the CPU
https://software.intel.com/en-us/articles/vectorization-in-julia.

I was referring to the 2nd meaning having to do with SIMD. As I understand
it, vector/matrix ops in matlab/octave take advantage of this (as does
Julia, correct?) So I was reading your looping statement as saying that the
loops get translated into SIMD operations in Julia.

Reply to this email directly or view it on GitHub
#187 (comment).

@timholy
Copy link
Member

timholy commented Sep 21, 2014

That article I linked to explains more, but basically, yes. For some reason LLVM 3.3 is reluctant to vectorize Float64s, but Float32s and integers vectorize quite nicely, and I think LLVM 3.6 (currently under development, but my understanding is that Julia will move to in the not too distant future) seems to vectorize Float64s. So it's already quite good, and will continue to improve.

@philtomson
Copy link
Author

On Sun, Sep 21, 2014 at 1:26 PM, Tim Holy notifications@github.com wrote:

So I notice you're importing the std and var there from base. Are you
suggesting that we can just implement std for ColorType using some type
coersions and call the underlying var and std from base?

Julia distinguishes functions from methods. Here's a trivial example:

f(x::Integer) = x+1f(x::FloatingPoint) = x+2

f is a function which has two methods.

If you want to extend a function defined in another module, you have to
import it, and then you can add new methods. We'd just need a method that
works for ColorTypes, i.e., skips that call to varm and does it manually.

For my purposes, since I really just need to get a float (or actually a
FixedPoint, but I suspect it's easy to get a fixed point from a float) that
represents the standard deviation, could I just use map to give me an array
of floats from the ColorTypes array and then run std on that?

(later I could implement something that would be more general in Images
that works on ColorTypes directly as you suggest, but for now I'd like to
be able to demo this to a coworker tomorrow to show how we could easily get
a standard deviation of an jpeg image)

Phil

@philtomson
Copy link
Author

Actually, when I do:

data(img)

I see that it's:

512x512 Array{RGB{UfixedBase{Uint8,8}},2}:

Each entry looks like:

RGB{Ufixed8}(0.643,0.588,0.278)

So RGB looks like a triple of floats, so I would think that I could
use map to extract each element of the triple into it's own 512x512
array and then run std each of those three Arrays with the end result
being a triple like: (std(r_array),std(g_array),std(b_array))

Does that seem like a reasonable approach for now?

Phil

On Sun, Sep 21, 2014 at 8:43 PM, Phil Tomson philtomson@gmail.com wrote:

On Sun, Sep 21, 2014 at 1:26 PM, Tim Holy notifications@github.com
wrote:

So I notice you're importing the std and var there from base. Are you
suggesting that we can just implement std for ColorType using some type
coersions and call the underlying var and std from base?

Julia distinguishes functions from methods. Here's a trivial example:

f(x::Integer) = x+1f(x::FloatingPoint) = x+2

f is a function which has two methods.

If you want to extend a function defined in another module, you have to
import it, and then you can add new methods. We'd just need a method that
works for ColorTypes, i.e., skips that call to varm and does it manually.

For my purposes, since I really just need to get a float (or actually a
FixedPoint, but I suspect it's easy to get a fixed point from a float) that
represents the standard deviation, could I just use map to give me an array
of floats from the ColorTypes array and then run std on that?

(later I could implement something that would be more general in Images
that works on ColorTypes directly as you suggest, but for now I'd like to
be able to demo this to a coworker tomorrow to show how we could easily get
a standard deviation of an jpeg image)

Phil

@kmsquire
Copy link
Collaborator

Hi Phil, if you need to separate the planes, the separate function will do that for you (see https://github.com/timholy/Images.jl#storage-order-and-changing-the-representation-of-images). However, reinterpreting the array is probably a better way to go (read a little further down the section of the README linked above.).

Cheers, Kevin

@philtomson
Copy link
Author

Tim,

Tangential question related to Images: Can you extract jpegs from an MPEG
file? (know of any way to do this?)

Phil

On Sun, Sep 21, 2014 at 11:25 PM, Kevin Squire notifications@github.com
wrote:

Hi Phil, if you need to separate the planes, the separate function will
do that for you (see
https://github.com/timholy/Images.jl#storage-order-and-changing-the-representation-of-images).
However, reinterpreting the array is probably a better way to go (read a
little further down the section of the README linked above.).

Cheers, Kevin


Reply to this email directly or view it on GitHub
#187 (comment).

@timholy
Copy link
Member

timholy commented Sep 22, 2014

Yes, see @kmsquire's lovely VideoIO package: https://github.com/kmsquire/VideoIO.jl

timholy added a commit that referenced this issue Jan 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants